Regression-Adjusted Estimation of Quantile Treatment Effects under Covariate-Adaptive Randomizations
Abstract
Datasets from field experiments with covariate-adaptive randomizations (CARs) usually contain extra covariates in addition to the strata indicators. We propose to incorporate these additional covariates via auxiliary regressions in the estimation and inference of unconditional quantile treatment effects (QTEs) under CARs. We establish the consistency and limit distribution of the regression-adjusted QTE estimator and prove that the use of multiplier bootstrap inference is non-conservative under CARs. The auxiliary regression may be estimated parametrically, nonparametrically, or via regularization when the data are high-dimensional. Even when the auxiliary regression is misspecified, the proposed bootstrap inferential procedure still achieves the nominal rejection probability in the limit under the null. When the auxiliary regression is correctly specified, the regression-adjusted estimator achieves the minimum asymptotic variance. We also discuss forms of adjustments that can improve the efficiency of the QTE estimators. The finite sample performance of the new estimation and inferential methods is studied in simulations, and an empirical application to a well-known dataset on the effect of expanding access to basic bank accounts on savings is reported.
Keywords: Covariate-adaptive randomization, High-dimensional data, Regression adjustment, Quantile treatment effects.
JEL codes: C14, C21, D14, G21
1 Introduction
Covariate-adaptive randomizations (CARs) have recently seen growing use in a wide variety of randomized experiments in economic research. Examples include Chong et al. (2016), Greaney et al. (2016), Jakiela and Ozier (2016), Burchardi et al. (2019), and Anderson and McKenzie (2021), among many others. Under CARs, units are first stratified using some baseline covariates, and then, within each stratum, the treatment status is assigned (independently of covariates) to achieve balance between the numbers of treated and control units.
In many empirical studies, apart from the average treatment effect (ATE), researchers are often interested in using randomized experiments to estimate quantile treatment effects (QTEs). The QTE has a useful role as a robustness check for the ATE and characterizes any heterogeneity that may be present in the sign and magnitude of the treatment effects according to their position within the distribution of outcomes. See, for example, Bitler et al. (2006), Muralidharan and Sundararaman (2011), Duflo et al. (2013), Banerjee et al. (2015), Crépon et al. (2015), and Campos et al. (2017).
Two practical issues arise in estimation and inference concerning QTEs under CARs. First, other covariates in addition to the strata indicators are collected during the experiment. It is possible to incorporate these covariates in the estimation of treatment effects to reduce variance and improve efficiency. In the estimation of ATE, the usual practice is to run a simple ordinary least squares (OLS) regression of the outcome on treatment status, strata indicators, and additional covariates as in the analysis of covariance (ANCOVA). Freedman (2008a, b) pointed out that such an OLS regression adjustment can degrade the precision of the ATE estimator. Lin (2013) reexamined Freedman's critique and showed that, in order to improve efficiency, the linear regression adjustment should include a full set of interactions between the treatment status and covariates. However, because the quantile function is a nonlinear operator, even when the assignment of treatment status is completely random, a similar linear quantile regression with a full set of interaction terms is unable to provide a consistent estimate of the unconditional QTE, not to mention the improvement of estimation efficiency. Second, in order to achieve balance in the respective number of treated and control units within each stratum, treatment statuses under CARs usually exhibit a (negative) cross-sectional dependence. Standard inference procedures that rely on cross-sectional independence are therefore conservative and lack power. (For example, Bugni et al. (2018) and Zhang and Zheng (2020) have shown that the usual two-sample t-test for inference concerning the ATE and the multiplier bootstrap inference concerning the QTE are in general conservative under CARs.) These two issues raise questions of how to use the additional covariates to consistently and more efficiently estimate the QTE in CAR settings and how to conduct valid statistical procedures that mitigate conservatism in inference.
The present paper addresses these issues by proposing a regression-adjusted estimator of the QTE, deriving its limit theory, and establishing the validity of multiplier bootstrap inference under CARs. Even under potential misspecification of the auxiliary regressions, the proposed QTE estimator is shown to maintain its consistency, and the multiplier bootstrap procedure is shown to have an asymptotic size equal to the nominal level under the null. When the auxiliary regression is correctly specified, the QTE estimator achieves minimum asymptotic variance.
We further investigate efficiency gains that materialize from the regression adjustments in three scenarios: (1) parametric regressions, (2) nonparametric regressions, and (3) regressions with regularization in high-dimensional settings. Specifically, for parametric regressions with a potentially misspecified linear probability model, we propose to compute the optimal linear coefficient by minimizing the variance of the QTE estimator. Such an adjustment is optimal within the class of linear adjustments but does not necessarily achieve the global minimum asymptotic variance. However, because no adjustment is a special case of the linear adjustment with all coefficients set to zero, our optimal linear adjustment is guaranteed to be weakly more efficient than the QTE estimator with no adjustments, which addresses Freedman's critique. We also consider a potentially misspecified logistic regression with fixed-dimensional regressors and strata- and quantile-specific regression coefficients, which is then estimated by quasi maximum likelihood estimation (QMLE). Although the QMLE does not necessarily minimize the asymptotic variance of the QTE, such a flexible logistic model can closely approximate the true specification. Therefore, in practice, the corresponding regression-adjusted QTE estimator usually has a smaller variance than that with no adjustments. Last, we propose to treat the logistic QMLE adjustments as new linear regressors and re-construct the corresponding optimal linear adjustments. We then show that the QTE estimator with the new adjustments is weakly more efficient than both the estimator with the original logistic QMLE adjustments and the estimator with no adjustments.
In nonparametric regressions, we further justify the QMLE by letting the regressors in the logistic regression be a set of sieve basis functions with increasing dimension and show how such a nonparametric regression-adjusted QTE estimator can achieve the global minimum asymptotic variance. For high-dimensional regressions with regularization, we consider logistic regression under ℓ1 penalization, an approach that also achieves the global minimum asymptotic variance. All the limit theories hold uniformly over a compact set of quantile indices, implying that our multiplier bootstrap procedure can be used to conduct inference on QTEs involving single, multiple, or a continuum of quantile indices.
These results, including the limit distribution of the regression-adjusted QTE estimator and the validity of the multiplier bootstrap, provide novel contributions to the literature in three respects. First, the data generated under CARs are different from observational data as the observed outcomes and treatment statuses are cross-sectionally dependent due to the randomization schemes. Recently Bugni et al. (2018) established a rigorous asymptotic framework to study the ATE estimator under CARs and pointed out the conservatism of the two-sample t-test except for some special cases. (See Bugni et al. (2018, Remark 4.2) for more detail.) Our analysis follows this new framework, which departs from the literature of causal inference under an i.i.d. treatment structure.
Second, we contribute to the literature on causal inference under CARs by developing a new methodology that includes additional covariates in the estimation of the unconditional QTE and by establishing a general theory for regression adjustments that allows for parametric, nonparametric, and regularized estimation of the auxiliary regressions. As mentioned earlier, unlike ATE estimation, the naive linear quantile regression with additional covariates cannot even produce a consistent estimator of the QTE. Instead, we propose a new way to incorporate additional covariates based on the Neyman orthogonal moment and investigate the asymptotic properties and the efficiency gains of the proposed regression-adjusted estimator under CARs. This new machinery allows us to study the QTE regression, which is nonparametrically specified, with both linear (linear probability model) and nonlinear (logit and probit models) regression adjustments. To clarify this contribution to the literature, we note that Hu and Hu (2012); Ma et al. (2015, 2020); Olivares (2021); Shao and Yu (2013); Zhang and Zheng (2020); Ye (2018); Ye and Shao (2020) considered inference on various causal parameters under CARs but without taking into account additional covariates. Bugni et al. (2018), Bugni et al. (2019), and Bugni and Gao (2021) considered saturated regressions for ATE and local ATE, which can be viewed as regression adjustments where strata indicators are interacted with the treatment or instrument. Shao et al. (2010) showed that if a test statistic is constructed based on the correctly specified model between outcome and additional covariates and the covariates used for CAR are functions of additional covariates, then the test statistic is valid conditional on additional covariates. Bloniarz et al. (2016); Fogarty (2018); Lin (2013); Lu (2016); Lei and Ding (2021); Li and Ding (2020); Liu et al. (2020); Liu and Yang (2020); Negi and Wooldridge (2020); Ye et al. (2022); Zhao and Ding (2021) studied various estimation methods based on regression adjustments, but these studies all focused on ATE estimation. Specifically, Liu et al. (2020) considered linear adjustments for ATE under CARs in which the covariates can be high-dimensional and the adjustments can be estimated by Lasso. Ansel et al. (2018) considered regression adjustment using additional covariates for ATE and local ATE. We differ from them by considering QTE with nonlinear adjustments such as the logistic Lasso.
Third, we establish the validity of the multiplier bootstrap inference for the regression-adjusted QTE estimator under CARs. To the best of our knowledge, Shao et al. (2010) and Zhang and Zheng (2020) are the only works in the literature that studied bootstrap inference under CARs. Shao et al. (2010) considered the covariate-adaptive bootstrap for the linear regression model. Zhang and Zheng (2020) proposed to bootstrap the inverse propensity score weighted (IPW) QTE estimator with the estimated target fraction of treatment even when the true fraction is known. They showed that the asymptotic variance of the IPW estimator is the same under various CARs. Thus, even though the bootstrap sample ignores the cross-sectional dependence and behaves as if the randomization scheme were simple, the asymptotic variance of the bootstrap analogue is still the same. We complement this research by studying the validity of multiplier bootstrap inference for our regression-adjusted QTE estimator. We establish analytically that the multiplier bootstrap with the estimated fraction of treatment is not conservative in the sense that it can achieve an asymptotic size equal to the nominal level under the null even when the auxiliary regressions are misspecified.
The present paper also comes under the umbrella of a growing literature that has addressed estimation and inference in randomized experiments. In this connection, we mention the studies of Hahn et al. (2011); Athey and Imbens (2017); Abadie et al. (2018); Tabord-Meehan (2021); Bai et al. (2021); Bai (2020); Jiang et al. (2021), among many others. Bai (2020) showed an 'optimal' matched-pair design can minimize the mean-squared error of the difference-in-means estimator for ATE, conditional on covariates. Tabord-Meehan (2021) designed an adaptive randomization procedure which can minimize the variance of the weighted estimator for ATE. Both works rely on a pilot experiment to design the optimal randomization. In contrast, we take the randomization scheme (i.e., CARs) as given and search for new estimators (other than difference-in-quantile and weighted estimators) for the QTE that have smaller variance. In addition, our approach does not require a pilot experiment. Therefore, our methods and theirs apply to different scenarios depending on the definition of 'optimality' and the data available, and thus complement each other.
From a practical perspective, our estimation and inferential methods have four advantages. First, they allow for common choices of auxiliary regressions such as linear probability, logit, and probit regressions, even though these regressions may be misspecified. Second, the methods can be implemented without tuning parameters. Third, our (bootstrap) estimator can be directly computed via the subgradient condition, and the auxiliary regressions need not be re-estimated in the bootstrap procedure, both of which save considerable computation time. Last, our estimation and inference methods can be implemented without knowledge of the exact treatment assignment rule used in the experiment. This advantage is especially useful in subsample analysis, where sub-groups are defined using variables other than those used to form the strata, so that the treatment assignment rule for each sub-group becomes unknown. See, for example, the anemic subsample analysis in Chong et al. (2016) and Zhang and Zheng (2020). These last three points carry over from Zhang and Zheng (2020) and are logically independent of the regression adjustments. One of our contributions is to show these results still hold for our regression-adjusted estimator.
The remainder of the paper is organized as follows. Section 2 describes the model setup and notation. Section 3 develops the asymptotic properties of our regression-adjusted QTE estimator. Section 4 studies the validity of the multiplier bootstrap inference. Section 5 considers parametric, nonparametric, and regularized estimation of the auxiliary regressions. Section 6 reports simulation results, and an empirical application of our methods to the impact of expanding access to basic bank accounts on savings is provided in Section 7. Section 8 concludes. Proofs of all results and some additional simulation results are given in the Online Supplement.
2 Setup and Notation
Potential outcomes for treated and control groups are denoted by $Y_i(1)$ and $Y_i(0)$, respectively. Treatment status is denoted by $A_i$, with $A_i = 1$ indicating treated and $A_i = 0$ untreated. The stratum indicator is denoted by $S_i$, based on which the researcher implements the covariate-adaptive randomization. The support of $S_i$ is denoted by $\mathcal{S}$, a finite set. After randomization, the researcher can observe the data $\{Y_i, S_i, A_i, X_i\}_{i=1}^{n}$, where $Y_i = Y_i(1) A_i + Y_i(0)(1 - A_i)$ is the observed outcome and $X_i$ contains extra covariates besides $S_i$ in the dataset. The support of $X_i$ is denoted $\mathcal{X}$. In this paper, we allow $X_i$ and $S_i$ to be dependent. For $s \in \mathcal{S}$, let $p(s) = P(S_i = s)$, $n(s) = \sum_{i=1}^{n} 1\{S_i = s\}$, $n_1(s) = \sum_{i=1}^{n} A_i 1\{S_i = s\}$, and $n_0(s) = n(s) - n_1(s)$. We make the following assumptions on the data generating process (DGP) and the treatment assignment rule.
Assumption 1.
(i) $\{Y_i(1), Y_i(0), S_i, X_i\}_{i=1}^{n}$ is i.i.d.
(ii) $\{Y_i(1), Y_i(0), X_i\}_{i=1}^{n} \perp \{A_i\}_{i=1}^{n} \mid \{S_i\}_{i=1}^{n}$.
(iii) Suppose $p(s)$ is fixed with respect to (w.r.t.) $n$ and is positive for every $s \in \mathcal{S}$.
(iv) Let $\pi(s)$ denote the target fraction of treatment for stratum $s$. Then, $c \leq \min_{s \in \mathcal{S}} \pi(s) \leq \max_{s \in \mathcal{S}} \pi(s) \leq 1 - c$ for some constant $c \in (0, 0.5)$ and $D_n(s) = o_p(n)$ for $s \in \mathcal{S}$, where $D_n(s) = \sum_{i=1}^{n} (A_i - \pi(s)) 1\{S_i = s\}$.
Several remarks are in order. First, Assumption 1(i) allows for cross-sectional dependence among treatment statuses ($\{A_i\}_{i=1}^{n}$), thereby accommodating many covariate-adaptive randomization schemes as discussed below. Second, although treatment statuses are cross-sectionally dependent, they are independent of the potential outcomes and additional covariates conditional on the stratum indicator $S_i$. Therefore, the data are still experimental rather than observational. Third, Assumption 1(iii) requires the size of each stratum to be proportional to the sample size. Fourth, we can view $\pi(s)$ as the target fraction of treated units in stratum $s$. Similar to Bugni et al. (2019), we allow the target fractions to differ across strata. Just as for the overlapping support condition in an observational study, the target fractions are assumed to be bounded away from zero and one. In randomized experiments, this condition usually holds because investigators can determine $\pi(s)$ in the design stage; in fact, in most CARs, $\pi(s) = 1/2$ for all $s \in \mathcal{S}$. Fifth, $D_n(s)$ represents the degree of imbalance between the realized and target fractions of treated units in the $s$th stratum. Bugni et al. (2018) show that Assumption 1(iv) holds under several covariate-adaptive treatment assignment rules such as simple random sampling (SRS), biased-coin design (BCD), adaptive biased-coin design (WEI), and stratified block randomization (SBR). For completeness, we briefly repeat their descriptions below. Note we only require $D_n(s) = o_p(n)$, which is weaker than the assumption imposed by Bugni et al. (2018) but the same as that imposed by Bugni et al. (2019) and Zhang and Zheng (2020).
Example 1 (SRS).
Let $\{A_i\}_{i=1}^{n}$ be drawn independently across $i$ and independently of $\{S_i\}_{i=1}^{n}$ as Bernoulli random variables with success rate $\pi(S_i)$, i.e., for $k = 1, \ldots, n$, $P(A_k = 1 \mid S_k) = \pi(S_k)$.
Example 2 (WEI).
This design was first proposed by Wei (1978). Let $n_{k-1}(s)$ be the number of units in stratum $s$ among the first $k - 1$ units, let $D_{k-1}(s)$ be the corresponding difference between the numbers of treated and control units, and let

$P\left(A_k = 1 \mid S_k, D_{k-1}(S_k)\right) = f\left(\frac{D_{k-1}(S_k)}{n_{k-1}(S_k)}\right),$

where $f(\cdot)$ is a pre-specified non-increasing function satisfying $f(-x) = 1 - f(x)$ and $D_0(S_1)/n_0(S_1)$ is understood to be zero.
Example 3 (BCD).
The treatment status is determined sequentially for $k = 1, \ldots, n$ as

$P\left(A_k = 1 \mid S_k, D_{k-1}(S_k)\right) = \begin{cases} 1/2 & \text{if } D_{k-1}(S_k) = 0, \\ \lambda & \text{if } D_{k-1}(S_k) < 0, \\ 1 - \lambda & \text{if } D_{k-1}(S_k) > 0, \end{cases}$

where $D_{k-1}(s)$ is defined as above and $\lambda \in (1/2, 1]$.
Example 4 (SBR).
For each stratum $s$, $\lfloor \pi(s) n(s) \rfloor$ units are assigned to treatment and the rest are assigned to control.
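To fix ideas, the following sketch simulates all four schemes. It is a minimal illustration rather than a transcription of any experiment's code: we assume a common target fraction $\pi(s) = 1/2$, and the choices $f(x) = (1 - x)/2$ for WEI and $\lambda = 0.75$ for BCD are conventional defaults adopted for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)

def assign_srs(S, pi=0.5):
    """Example 1 (SRS): i.i.d. Bernoulli draws, independent of strata."""
    return rng.binomial(1, pi, size=len(S))

def assign_wei(S, f=lambda x: (1 - x) / 2):
    """Example 2 (WEI): Wei's adaptive biased-coin design.
    f is non-increasing with f(-x) = 1 - f(x); 0/0 is read as 0."""
    A = np.zeros(len(S), dtype=int)
    for k, s in enumerate(S):
        idx = S[:k] == s                       # earlier units in the same stratum
        n_s = idx.sum()
        D = (2 * A[:k][idx] - 1).sum()         # imbalance: treated minus control
        x = 0.0 if n_s == 0 else D / n_s
        A[k] = rng.binomial(1, f(x))
    return A

def assign_bcd(S, lam=0.75):
    """Example 3 (BCD): biased-coin design with parameter lam in (1/2, 1]."""
    A = np.zeros(len(S), dtype=int)
    for k, s in enumerate(S):
        D = (2 * A[:k][S[:k] == s] - 1).sum()
        p = 0.5 if D == 0 else (lam if D < 0 else 1 - lam)
        A[k] = rng.binomial(1, p)
    return A

def assign_sbr(S, pi=0.5):
    """Example 4 (SBR): within each stratum, treat a fixed fraction pi."""
    A = np.zeros(len(S), dtype=int)
    for s in np.unique(S):
        members = np.flatnonzero(S == s)
        n_treat = int(np.floor(pi * len(members)))
        A[rng.choice(members, size=n_treat, replace=False)] = 1
    return A

S = rng.integers(0, 4, size=200)               # 4 strata
for fn in (assign_srs, assign_wei, assign_bcd, assign_sbr):
    A = fn(S)
    print(fn.__name__, [A[S == s].mean().round(2) for s in range(4)])
```

Running the snippet shows the within-stratum treated fractions of WEI, BCD, and SBR concentrating much more tightly around 1/2 than those of SRS, which is exactly the imbalance behavior that Assumption 1(iv) formalizes.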
Denote the $\tau$th quantile of $Y_i(a)$ by $q_a(\tau)$ for $a = 0, 1$. We are interested in estimating and inferring the $\tau$th quantile treatment effect defined as $q(\tau) = q_1(\tau) - q_0(\tau)$. The testing problems of interest involve single, multiple, or even a continuum of quantile indices, as in null hypotheses of the form

$\mathcal{H}_0: q(\tau) = \underline{q}(\tau) \quad \text{for } \tau \in \Upsilon_0,$

for some pre-specified value or function $\underline{q}(\cdot)$, where $\Upsilon_0$ may be a singleton, a finite set, or the whole compact set $\Upsilon \subset (0, 1)$. We can also test a constant QTE by letting $\underline{q}(\tau)$ in the last case be a constant $\underline{q}$.
3 Estimation
Define $m_a(\tau, s, x) = P(Y_i(a) \leq q_a(\tau) \mid S_i = s, X_i = x)$ for $a = 0, 1$, which are the true specifications but unknown to researchers. Instead, researchers specify working models $\overline m_a(\tau, s, x)$ for the true specification, which can be misspecified. (We view $\overline m_a(\tau, s, x)$ as some function with inputs $(\tau, s, x)$. For example, researchers can specify a linear probability model with $\overline m_a(\tau, s, x) = x^\top \theta_{a,s}(\tau)$, where $\theta_{a,s}(\tau)$ is a linear coefficient that varies across treatment status $a$ and stratum $s$.) Last, the researchers estimate the (potentially misspecified) working models via some form of regression, and the estimators are denoted as $\hat{\overline m}_a(\tau, s, x)$. We also refer to $\overline m_a(\tau, s, x)$ as the auxiliary regression.
Our regression-adjusted estimator of $q_1(\tau)$, denoted as $\hat q_1(\tau)$, can be defined as

$\hat q_1(\tau) \in \arg\min_{q} \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{A_i \rho_\tau(Y_i - q)}{\hat\pi(S_i)} + \frac{A_i - \hat\pi(S_i)}{\hat\pi(S_i)} \left(\tau - \hat{\overline m}_1(\tau, S_i, X_i)\right) q \right], \tag{3.1}$

where $\rho_\tau(u) = u\left(\tau - 1\{u \leq 0\}\right)$ is the usual check function and $\hat\pi(s) = n_1(s)/n(s)$. We emphasize that $\hat{\overline m}_1(\tau, s, x)$ may not consistently estimate the true specification $m_1(\tau, s, x)$. Similarly, we can define

$\hat q_0(\tau) \in \arg\min_{q} \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{(1 - A_i)\rho_\tau(Y_i - q)}{1 - \hat\pi(S_i)} - \frac{A_i - \hat\pi(S_i)}{1 - \hat\pi(S_i)} \left(\tau - \hat{\overline m}_0(\tau, S_i, X_i)\right) q \right]. \tag{3.2}$

Then, our regression-adjusted QTE estimator is

$\hat q(\tau) = \hat q_1(\tau) - \hat q_0(\tau). \tag{3.3}$
Several remarks are in order. First, in observational studies with i.i.d. data and unconfounded treatment assignment, Firpo (2007), Belloni et al. (2017), and Kallus et al. (2020) showed that the doubly robust moment for $q_1(\tau)$ is

$E\left[ \frac{A_i\left(\tau - 1\{Y_i \leq q_1(\tau)\}\right)}{\overline\pi(S_i, X_i)} - \frac{A_i - \overline\pi(S_i, X_i)}{\overline\pi(S_i, X_i)} \left(\tau - \overline m_1(\tau, S_i, X_i)\right) \right] = 0, \tag{3.4}$

where $\overline\pi(\cdot)$ and $\overline m_1(\cdot)$ are the working models for the target fraction $\pi(\cdot)$ and the conditional probability $m_1(\cdot)$, respectively. Our estimator is motivated by this doubly robust moment, but our analysis differs from that for observational data because CARs introduce cross-sectional dependence among observations. Second, as our target fraction estimator $\hat\pi(s) = n_1(s)/n(s)$ is consistent, the working model for the target fraction is correctly specified in the limit. Then, due to the double robustness, our regression-adjusted estimator is consistent even when $\overline m_a$ is misspecified and $\hat{\overline m}_a$ is an inconsistent estimator of $m_a$. Third, we use the estimated target fraction $\hat\pi(s)$ even when the true $\pi(s)$ is known because this guarantees that the bootstrap inference is not conservative. Further discussion is provided after Theorem 4.1.
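The following sketch mirrors the estimator in (3.1)–(3.3) as reconstructed above. Because each objective is convex and piecewise linear in $q$, a minimizer can be found among the observed outcomes, so the illustration avoids numerical optimizers (the paper instead computes the estimator from subgradient conditions; see Section B of the Online Supplement). The fitted adjustments `m1_hat` and `m0_hat` (one value per observation) would come from any of the auxiliary regressions in Section 5; setting them to zero recovers the estimator with no adjustments. Function names are ours.

```python
import numpy as np

def rho(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u <= 0})."""
    return u * (tau - (u <= 0))

def adjusted_quantile(Y, A, S, tau, m_hat, arm=1):
    """Regression-adjusted estimator of q_arm(tau), following our reading of
    (3.1)-(3.2). m_hat[i] holds the fitted working model evaluated at
    (tau, S_i, X_i). Assumes every stratum contains both arms."""
    pi_hat = np.array([A[S == s].mean() for s in S])   # pi_hat(S_i) = n_1(s)/n(s)
    if arm == 1:
        w_check = A / pi_hat
        w_adj = (A - pi_hat) / pi_hat
    else:
        w_check = (1 - A) / (1 - pi_hat)
        w_adj = -(A - pi_hat) / (1 - pi_hat)
    def objective(q):
        return np.mean(w_check * rho(Y - q, tau) + w_adj * (tau - m_hat) * q)
    # convex and piecewise linear in q, so a kink (an observed Y) minimizes it
    return min(np.unique(Y), key=objective)

def adjusted_qte(Y, A, S, tau, m1_hat, m0_hat):
    """QTE estimator (3.3): difference of the two adjusted quantiles."""
    return (adjusted_quantile(Y, A, S, tau, m1_hat, arm=1)
            - adjusted_quantile(Y, A, S, tau, m0_hat, arm=0))
```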
Assumption 2.
For $a = 0, 1$, denote $f_a(\cdot)$, $f_a(\cdot \mid s)$, and $f_a(\cdot \mid s, x)$ as the PDFs of $Y_i(a)$, $Y_i(a) \mid S_i = s$, and $Y_i(a) \mid S_i = s, X_i = x$, respectively.
(i) $f_a(q_a(\tau))$ and $f_a(q_a(\tau) \mid s)$ are bounded and bounded away from zero uniformly over $\tau \in \Upsilon$ and $s \in \mathcal{S}$, where $\Upsilon$ is a compact subset of $(0, 1)$.
(ii) $f_a(\cdot)$ and $f_a(\cdot \mid s)$ are Lipschitz over $\{q_a(\tau) : \tau \in \Upsilon\}$.
(iii) $\max_{a = 0, 1} \sup_{y, s, x} f_a(y \mid s, x) < \infty$.
Assumption 3.
(i) For $a = 0, 1$, there exists a function $\overline m_a(\tau, s, x)$ such that, for $\Delta_a(\tau, s, x) = \hat{\overline m}_a(\tau, s, x) - \overline m_a(\tau, s, x)$, we have

$\max_{a = 0, 1} \sup_{\tau \in \Upsilon, s \in \mathcal{S}} \left| \frac{1}{n_a(s)} \sum_{i \in I_a(s)} \Delta_a(\tau, s, X_i) - \frac{1}{n(s)} \sum_{i: S_i = s} \Delta_a(\tau, s, X_i) \right| = o_p(n^{-1/2}),$

where $I_a(s) = \{i : A_i = a, S_i = s\}$ and $n_a(s) = |I_a(s)|$.
(ii) For $a = 0, 1$ and $s \in \mathcal{S}$, let $\mathcal{F}_{a,s} = \{\overline m_a(\tau, s, \cdot) : \tau \in \Upsilon\}$ with an envelope $F_{a,s}(\cdot)$. Then, for $a = 0, 1$ and $s \in \mathcal{S}$, there exist fixed constants $(\alpha, v)$ such that

$\sup_{Q} N\left(\mathcal{F}_{a,s}, \|\cdot\|_{Q,2}, \varepsilon \|F_{a,s}\|_{Q,2}\right) \leq \left(\alpha/\varepsilon\right)^{v}, \quad \forall \varepsilon \in (0, 1],$

where $N(\cdot)$ denotes the covering number, $\|F_{a,s}\|_{Q,2} = (\int F_{a,s}^2 \, dQ)^{1/2}$, and the supremum is taken over all finitely discrete probability measures $Q$.
(iii) For $a = 0, 1$ and any $\tau_1, \tau_2 \in \Upsilon$, there exists a constant $C > 0$ such that

$E\left[\left(\overline m_a(\tau_1, S_i, X_i) - \overline m_a(\tau_2, S_i, X_i)\right)^2\right] \leq C |\tau_1 - \tau_2|.$
Several remarks are in order. First, Assumption 2 is standard in the quantile regression literature. We do not need the conditional density $f_a(\cdot \mid s, x)$ to be bounded away from zero because we are interested in the unconditional quantile $q_a(\tau)$, which is uniquely defined as long as the unconditional density is positive. Second, Assumption 3(i) is high-level. If we consider a linear probability model such that $\overline m_a(\tau, s, x) = x^\top \theta_{a,s}(\tau)$ and $\hat{\overline m}_a(\tau, s, x) = x^\top \hat\theta_{a,s}(\tau)$, then Assumption 3(i) is equivalent to

$\max_{a = 0, 1} \sup_{\tau \in \Upsilon, s \in \mathcal{S}} \left| \left( \frac{1}{n_a(s)} \sum_{i \in I_a(s)} X_i - \frac{1}{n(s)} \sum_{i: S_i = s} X_i \right)^\top \left( \hat\theta_{a,s}(\tau) - \theta_{a,s}(\tau) \right) \right| = o_p(n^{-1/2}),$

which is similar to Liu et al. (2020, Assumption 3) and holds intuitively if $\hat\theta_{a,s}(\tau)$ is a consistent estimator of the pseudo true value $\theta_{a,s}(\tau)$. Third, Assumptions 3(ii) and 3(iii) impose mild regularity conditions on $\overline m_a$. Assumption 3(ii) holds automatically if $\Upsilon$ is a finite set. In general, both Assumptions 3(ii) and 3(iii) hold if

$\left| \overline m_a(\tau_1, s, x) - \overline m_a(\tau_2, s, x) \right| \leq C |\tau_1 - \tau_2|$

for some constant $C$. Such Lipschitz continuity holds for the true specification $m_a(\tau, s, x)$ under Assumption 2. Fourth, we provide primitive sufficient conditions for Assumption 3 in Section 5.
Theorem 3.1.
Suppose Assumptions 1–3 hold. Then, uniformly over $\tau \in \Upsilon$,

$\sqrt{n}\left(\hat q(\tau) - q(\tau)\right) \rightsquigarrow \mathcal{B}(\tau),$

where $\mathcal{B}(\tau)$ is a tight Gaussian process with covariance kernel $\Sigma(\tau, \tau')$ defined in Section E of the Online Supplement. In addition, for any finite set of quantile indices $(\tau_1, \ldots, \tau_K) \subset \Upsilon$, the asymptotic covariance matrix of $(\hat q(\tau_1), \ldots, \hat q(\tau_K))$ is denoted as $[\Sigma(\tau_k, \tau_l)]_{k, l \in [K]}$, where we use $[\Sigma(\tau_k, \tau_l)]_{k, l \in [K]}$ to denote a matrix whose $(k, l)$th entry is $\Sigma(\tau_k, \tau_l)$. Then, $[\Sigma(\tau_k, \tau_l)]_{k, l \in [K]}$ is minimized in the matrix sense (for two symmetric matrices $A$ and $B$, we say $A$ is greater than or equal to $B$ if $A - B$ is positive semidefinite) when the auxiliary regressions are correctly specified at $(\tau_1, \ldots, \tau_K)$, i.e., for $a = 0, 1$ and $k \in [K]$, $\overline m_a(\tau_k, s, x) = m_a(\tau_k, s, x)$ for all $(s, x)$ in the joint support of $(S_i, X_i)$.
Three remarks are in order. First, the expression for the asymptotic variance of $\hat q(\tau)$ can be found in the proof of Theorem 3.1. It is the same whether or not the randomization scheme achieves strong balance (we refer readers to Bugni et al. (2018) for the definition of strong balance). This robustness is due to the use of the estimated target fraction $\hat\pi(s)$. The same phenomenon was discovered in a simplified setting by Zhang and Zheng (2020). Second, although our estimator is still consistent and asymptotically normal when the auxiliary regression is misspecified, it is meaningful to pursue the correct specification as it achieves the minimum variance. As the estimator with no adjustments can be viewed as a special case of our estimator with $\hat{\overline m}_a = 0$, Theorem 3.1 implies that the adjusted estimator with the correctly specified auxiliary regression is more efficient than that with no adjustments. If the auxiliary regression is misspecified, the adjusted estimator can sometimes be less efficient than the unadjusted one, which is known as Freedman's critique. In Section 5, we discuss how to make adjustments that do not harm the precision of the QTE estimator. Third, the asymptotic variance of $\hat q(\tau)$ depends on infinite-dimensional nuisance parameters such as the densities $f_a(\cdot)$ and $f_a(\cdot \mid s)$. To conduct analytic inference, it is necessary to estimate these nuisance parameters nonparametrically, which requires tuning parameters. Nonparametric estimation can be sensitive to the choice of tuning parameters, and rule-of-thumb tuning parameter selection may not be appropriate for every DGP or every quantile. The use of cross-validation in selecting the tuning parameters is possible in principle but, in practice, time-consuming. These practical difficulties of analytic inference provide strong motivation to investigate bootstrap procedures that are much less reliant on tuning parameters.
4 Multiplier Bootstrap Inference
We approximate the asymptotic distribution of $\hat q(\tau)$ via the multiplier bootstrap. Let $\{\xi_i\}_{i=1}^{n}$ be a sequence of bootstrap weights which will be specified later. Define $n^w(s) = \sum_{i=1}^{n} \xi_i 1\{S_i = s\}$, $n_1^w(s) = \sum_{i=1}^{n} \xi_i A_i 1\{S_i = s\}$, $n_0^w(s) = n^w(s) - n_1^w(s)$, and $\hat\pi^w(s) = n_1^w(s)/n^w(s)$. The multiplier bootstrap counterpart of $\hat q(\tau)$ is denoted by $\hat q^w(\tau)$ and defined as $\hat q^w(\tau) = \hat q_1^w(\tau) - \hat q_0^w(\tau)$, where

$\hat q_1^w(\tau) \in \arg\min_{q} \frac{1}{n} \sum_{i=1}^{n} \xi_i \left[ \frac{A_i \rho_\tau(Y_i - q)}{\hat\pi^w(S_i)} + \frac{A_i - \hat\pi^w(S_i)}{\hat\pi^w(S_i)} \left(\tau - \hat{\overline m}_1(\tau, S_i, X_i)\right) q \right] \tag{4.1}$

and

$\hat q_0^w(\tau) \in \arg\min_{q} \frac{1}{n} \sum_{i=1}^{n} \xi_i \left[ \frac{(1 - A_i)\rho_\tau(Y_i - q)}{1 - \hat\pi^w(S_i)} - \frac{A_i - \hat\pi^w(S_i)}{1 - \hat\pi^w(S_i)} \left(\tau - \hat{\overline m}_0(\tau, S_i, X_i)\right) q \right]. \tag{4.2}$
Two comments on implementation are noted here: (i) we do not re-estimate the auxiliary regressions $\hat{\overline m}_a$ in the bootstrap sample, which is similar to the multiplier bootstrap procedure proposed by Belloni et al. (2017); and (ii) in Section B of the Online Supplement we propose a way to directly compute $\hat q_a^w(\tau)$ from the subgradient conditions of (4.1) and (4.2), thereby avoiding the optimization. Both features considerably reduce the computation time of our bootstrap procedure.
Next, we specify the bootstrap weights.
Assumption 4.
Suppose $\{\xi_i\}_{i=1}^{n}$ is a sequence of nonnegative i.i.d. random variables with unit expectation and variance and a sub-exponential upper tail.
Assumption 5.
Recall $\Delta_a(\tau, s, x) = \hat{\overline m}_a(\tau, s, x) - \overline m_a(\tau, s, x)$ defined in Assumption 3. We have, for $a = 0, 1$, the bootstrap-weighted counterpart of the condition in Assumption 3(i), with the averages computed using the weights $\{\xi_i\}_{i=1}^{n}$.
We require the bootstrap weights to be nonnegative so that the objective functions in (4.1) and (4.2) are convex. In practice, we generate $\{\xi_i\}_{i=1}^{n}$ independently from the standard exponential distribution. Assumption 5 is the bootstrap counterpart of Assumption 3. Continuing with the linear model example considered after Assumption 3, Assumption 5 requires the weighted analogue of the displayed condition there, which holds if $\hat\theta_{a,s}(\tau)$ is a uniformly consistent estimator of $\theta_{a,s}(\tau)$.
Theorem 4.1.
Suppose Assumptions 1–5 hold. Then, uniformly over $\tau \in \Upsilon$ and conditionally on the data,

$\sqrt{n}\left(\hat q^w(\tau) - \hat q(\tau)\right) \rightsquigarrow \mathcal{B}(\tau),$

where $\mathcal{B}(\tau)$ is the same Gaussian process defined in Theorem 3.1. (We view $\sqrt{n}(\hat q^w(\tau) - \hat q(\tau))$ and $\sqrt{n}(\hat q(\tau) - q(\tau))$ as two processes indexed by $\tau \in \Upsilon$. Following van der Vaart and Wellner (1996, Chapter 2.9), we say the former weakly converges to $\mathcal{B}(\tau)$ conditionally on the data and uniformly over $\tau \in \Upsilon$ if $\sup_{h \in \mathrm{BL}_1} |E_\xi h(\sqrt{n}(\hat q^w - \hat q)) - E h(\mathcal{B})| \xrightarrow{p} 0$, where $\mathrm{BL}_1$ is the set of all functions bounded by one with Lipschitz constant bounded by one, and $E_\xi$ denotes expectation with respect to the bootstrap weights $\{\xi_i\}_{i=1}^{n}$.)
Two remarks are in order. First, Theorem 4.1 shows that the limit distribution of the bootstrap estimator conditional on the data can approximate that of the original estimator uniformly over $\tau \in \Upsilon$. This is the theoretical foundation for the bootstrap confidence intervals and bands described in Section B of the Online Supplement. Specifically, denote $\{\hat q^w_b(\tau)\}_{b=1}^{B}$ as the bootstrap estimates, where $B$ is the number of bootstrap replications. Let $\hat Q^w(\alpha)$ and $z_\alpha$ be the $\alpha$th empirical quantile of this sequence and the $\alpha$th standard normal critical value, respectively. Then, we suggest using the bootstrap estimates to construct the standard error of $\hat q(\tau)$ as $(\hat Q^w(0.975) - \hat Q^w(0.025))/(z_{0.975} - z_{0.025})$. Note that, unlike Hahn and Liao (2021), our bootstrap standard error is not conservative. In our context, the bootstrap variance estimator considered by Hahn and Liao (2021) is $E_\xi \, n(\hat q^w(\tau) - \hat q(\tau))^2$, where $E_\xi$ is the conditional expectation given the data. It is well known that weak convergence does not imply convergence in $L_2$-norm, which explains why they can show their estimator is in general conservative. Instead, we use a different estimator of the standard error, based on bootstrap quantiles, and can show it is consistent given weak convergence. Second, such a bootstrap approximation is consistent under CARs. Zhang and Zheng (2020) showed that, for QTE estimation without regression adjustment, bootstrapping the IPW QTE estimator with the estimated target fraction results in non-conservative inference, while bootstrapping the IPW estimator with the true fraction is conservative under CARs. As the estimator considered by Zhang and Zheng (2020) is a special case of our regression-adjusted estimator with $\hat{\overline m}_a = 0$, we conjecture that the same conclusion holds here. A proof of conservative bootstrap inference with the true target fraction is not included in the paper due to space limits. (Full statements and proofs are lengthy because we would need to derive the limit distributions of not only the bootstrap but also the original estimator with the true target fraction. Although the negative result is theoretically interesting, we are not aware of any empirical papers using the true target fraction while making regression adjustments. Moreover, our method is shown to have better performance than the one with the true target fraction in simulations, so the practical value of proving the negative result is limited.) Our simulations confirm both the correct size coverage of our inference method using the bootstrap with the estimated target fraction and the conservatism of the bootstrap with the true target fraction. The standard error of the QTE estimator is found to be 34.9% larger on average when using the true rather than the estimated target fraction in the simulations (see Table 1 below and Table 15 in the Online Supplement).
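A schematic implementation of the bootstrap, under the same reconstruction as in Section 3, is sketched below: the fitted adjustments are reused across draws, while the target fraction is re-estimated with the weights in each draw. The quantile-spread standard error in the last lines follows the suggestion in the first remark. All function names are ours.

```python
import numpy as np
from scipy.stats import norm

def weighted_adjusted_quantile(Y, A, S, tau, m_hat, xi, arm=1):
    """Bootstrap analogue of (4.1)-(4.2): the weighted objective with the
    target fraction re-estimated as pi_w(s) = sum(xi*A*1{S=s})/sum(xi*1{S=s})."""
    pi_w = np.array([np.sum(xi * A * (S == s)) / np.sum(xi * (S == s)) for s in S])
    if arm == 1:
        w_check, w_adj = A / pi_w, (A - pi_w) / pi_w
    else:
        w_check, w_adj = (1 - A) / (1 - pi_w), -(A - pi_w) / (1 - pi_w)
    def objective(q):
        u = Y - q
        return np.mean(xi * (w_check * u * (tau - (u <= 0)) + w_adj * (tau - m_hat) * q))
    return min(np.unique(Y), key=objective)

def bootstrap_se(Y, A, S, tau, m1_hat, m0_hat, B=200, seed=1):
    """Multiplier-bootstrap standard error of the adjusted QTE; the fitted
    adjustments m1_hat, m0_hat are NOT re-estimated across draws."""
    rng = np.random.default_rng(seed)
    draws = np.empty(B)
    for b in range(B):
        xi = rng.exponential(1.0, size=len(Y))   # nonnegative, mean 1, variance 1
        draws[b] = (weighted_adjusted_quantile(Y, A, S, tau, m1_hat, xi, arm=1)
                    - weighted_adjusted_quantile(Y, A, S, tau, m0_hat, xi, arm=0))
    q_lo, q_hi = np.quantile(draws, [0.025, 0.975])
    return (q_hi - q_lo) / (norm.ppf(0.975) - norm.ppf(0.025))
```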
5 Auxiliary Regressions
In this section, we consider two approaches to estimating the auxiliary regressions: (1) a parametric method and (2) a nonparametric method. In Section A of the Online Supplement, we further consider a regularization method for high-dimensional covariates. For the parametric method, we do not require the model to be correctly specified, and we propose ways to estimate the pseudo true value of the auxiliary regression. For the other two methods, we (nonparametrically) estimate the true model so that the asymptotic variance of $\hat q(\tau)$ achieves its minimum based on Theorem 3.1. For all three methods, we verify Assumptions 3 and 5.
5.1 Parametric method
In this section, we consider the case where $X_i$ is finite-dimensional. Recall $m_a(\tau, s, x) = P(Y_i(a) \leq q_a(\tau) \mid S_i = s, X_i = x)$ for $a = 0, 1$. We propose to model $m_a(\tau, s, x)$ as $F_\tau(x, \theta_{a,s}(\tau))$, where $\theta_{a,s}(\tau)$ is a finite-dimensional parameter that depends on $(a, s, \tau)$, so that our working model for $m_a$ is

$\overline m_a(\tau, s, x) = F_\tau\left(x, \theta_{a,s}(\tau)\right). \tag{5.1}$

We note that, as we allow for misspecification, researchers have the freedom to choose any functional form for $F_\tau$ and any pseudo true value for $\theta_{a,s}(\tau)$, both of which can vary with respect to $\tau$. For example, if we assume a logistic regression with $F_\tau(x, \theta) = \Lambda(x^\top \theta)$, where $\Lambda$ is the logistic CDF, then there are various choices of $\theta_{a,s}(\tau)$, such as the maximizer of the population pseudo likelihood, the maximizer of the population version of the least squares objective function, or the minimizer of the asymptotic variance of the adjusted QTE estimator. As the logistic model is potentially misspecified, these three pseudo true values are not necessarily the same and can lead to different adjustments, and thus different asymptotic variances of the corresponding adjusted QTE estimators.
Next, we state a general result for generic choices of $F_\tau$ and $\theta_{a,s}(\tau)$. Suppose we estimate $\theta_{a,s}(\tau)$ by $\hat\theta_{a,s}(\tau)$. Then, the corresponding estimator of the working model can be written as

$\hat{\overline m}_a(\tau, s, x) = F_\tau\left(x, \hat\theta_{a,s}(\tau)\right). \tag{5.2}$
Assumption 6.
(i) Suppose there exist a positive random variable $L$ and a positive constant $c$ such that, for any $\tau_1, \tau_2 \in \Upsilon$ and any $\theta$,

$\left| F_{\tau_1}(x, \theta) - F_{\tau_2}(x, \theta) \right| \leq L |\tau_1 - \tau_2|^{c}.$

(ii) $\sup_{a = 0, 1, \, s \in \mathcal{S}} \left\| \theta_{a,s}(\tau_1) - \theta_{a,s}(\tau_2) \right\| \leq C |\tau_1 - \tau_2|^{c}$ for some constant $C$ and any $\tau_1, \tau_2 \in \Upsilon$.
(iii) $\max_{a = 0, 1} \sup_{\tau \in \Upsilon, \, s \in \mathcal{S}} \left\| \hat\theta_{a,s}(\tau) - \theta_{a,s}(\tau) \right\| = o_p(1)$.
Three remarks are in order. First, common choices for auxiliary regressions are linear probability, logistic, and probit regressions, corresponding to $F_\tau(x, \theta) = x^\top \theta$, $\Lambda(x^\top \theta)$, and $\Phi(x^\top \theta)$, respectively, where $\Phi$ is the standard normal CDF and $\Lambda$ is the logistic CDF. For these models, the functional form does not depend on $\tau$, and Assumption 6(i) holds automatically. For the linear regression case, we do not include the intercept because our regression-adjusted estimators ((3.1) and (3.2)) and their bootstrap counterparts ((4.1) and (4.2)) are numerically invariant to location shifts of the auxiliary regressions. Second, it is also important to allow the functional form to vary across $\tau$ to incorporate the case in which the regressor $x$ in the linear, logistic, and probit regressions is replaced by $h_\tau(x)$, a function of $x$ that depends on $\tau$. We give a concrete example of this situation in Section 5.1.3. Third, Assumption 6(ii) also holds automatically if $\Upsilon$ is a finite set. When $\Upsilon$ is infinite, this condition is still mild.
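For concreteness, the three working models named in the first remark can be written as simple callables (a sketch; the parameter theta stands for whichever pseudo true value or estimate is being used):

```python
import numpy as np
from scipy.stats import norm

# Three common parametric working models F(x, theta) for m_a(tau, s, x);
# none needs to be correctly specified.
def linear_prob(X, theta):   # linear probability model: x'theta
    return X @ theta

def logit(X, theta):         # logistic: Lambda(x'theta)
    return 1.0 / (1.0 + np.exp(-(X @ theta)))

def probit(X, theta):        # probit: Phi(x'theta)
    return norm.cdf(X @ theta)
```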
Theorem 5.1.
Denote $\hat q(\tau)$ and $\hat q^w(\tau)$ as the $\tau$th QTE estimator and its multiplier bootstrap counterpart defined in Sections 3 and 4, respectively, with $\overline m_a$ and $\hat{\overline m}_a$ defined in (5.1) and (5.2), respectively. Suppose Assumptions 1, 2, 4, and 6 hold. Then, Assumptions 3 and 5 hold, which further implies Theorems 3.1 and 4.1 hold for $\hat q(\tau)$ and $\hat q^w(\tau)$, respectively.

Theorem 5.1 shows that, as long as the estimator $\hat\theta_{a,s}(\tau)$ of the pseudo true value is uniformly consistent, under mild regularity conditions, all the general estimation and bootstrap inference results established in Sections 3 and 4 hold.
5.1.1 Linear probability model
In this section, we consider linear adjustments with parameter $\theta_{a,s}(\tau)$ such that

$\overline m_a(\tau, s, x) = h_\tau(x)^\top \theta_{a,s}(\tau), \tag{5.3}$

where the regressor $h_\tau(x)$ is a function of $x$ whose functional form may vary across $\tau$. For example, we can consider $h_\tau(x) = x$, transformations of $x$ such as quadratic and interaction terms, and some prediction of the outcome indicator $1\{Y_i \leq q_a(\tau)\}$ given $S_i$ and $X_i$. The last example is further explained in Section 5.1.3.
We note that the asymptotic variance of $\hat q(\tau)$, denoted $\sigma^2(\tau)$, is a function of the working models $(\overline m_0, \overline m_1)$, which are further indexed by their parameters $\theta = \{\theta_{a,s}(\tau)\}_{a = 0, 1, \, s \in \mathcal{S}}$, i.e., $\sigma^2(\tau) = \sigma^2(\tau; \theta)$. Our optimal linear adjustment corresponds to the parameter value $\theta^*$ that minimizes $\sigma^2(\tau; \theta)$, i.e.,

$\theta^* \in \arg\min_{\theta} \sigma^2(\tau; \theta).$
Assumption 7.
Define $\Sigma_{a,s}(\tau) = E\left[h_\tau(X_i) h_\tau(X_i)^\top \mid A_i = a, S_i = s\right]$. There exist constants $0 < c \leq C < \infty$ such that

$c \leq \inf_{\tau \in \Upsilon} \min_{a = 0, 1, \, s \in \mathcal{S}} \lambda_{\min}\left(\Sigma_{a,s}(\tau)\right) \leq \sup_{\tau \in \Upsilon} \max_{a = 0, 1, \, s \in \mathcal{S}} \lambda_{\max}\left(\Sigma_{a,s}(\tau)\right) \leq C$

and $E \sup_{\tau \in \Upsilon} \|h_\tau(X_i)\|^{d} < \infty$ for some $d > 2$, where for a generic symmetric matrix $A$, $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$ denote the minimal and maximal eigenvalues of $A$, respectively.
The next theorem derives the closed-form expression for the optimal linear coefficient.
Theorem 5.2.
Four remarks are in order. First, the optimal linear coefficients are not uniquely defined. In order to achieve the minimal variance, we only need to consistently estimate one of the minimizers. We choose

$\theta^*_{a,s}(\tau) = \left( E\left[ h_\tau(X_i) h_\tau(X_i)^\top \mid A_i = a, S_i = s \right] \right)^{-1} E\left[ h_\tau(X_i) 1\{Y_i \leq q_a(\tau)\} \mid A_i = a, S_i = s \right],$

as this choice avoids estimation of the densities $f_1(q_1(\tau))$ and $f_0(q_0(\tau))$. In Theorem 5.3 below, we propose estimators of $\theta^*_{1,s}(\tau)$ and $\theta^*_{0,s}(\tau)$ and show they are consistent uniformly over $\tau \in \Upsilon$ and $s \in \mathcal{S}$. Second, note that no adjustment is nested by our linear adjustment with zero coefficients. Due to the optimality result established in Theorem 5.2, our regression-adjusted QTE estimator with (consistent estimators of) the optimal linear coefficients is more efficient than that with no adjustments. Third, we also need to clarify that the optimality of $\theta^*_{a,s}(\tau)$ is only within the class of linear adjustments. It is possible that the QTE estimator with some nonlinear adjustment is more efficient than that with the optimal linear adjustment, especially because the linear probability model is likely misspecified. Fourth, the optimal linear coefficients minimize (over the class of linear models) not only the asymptotic variance of $\hat q(\tau)$ but also the covariance matrix of $(\hat q(\tau_1), \ldots, \hat q(\tau_K))$ for any finite-dimensional quantile indices $(\tau_1, \ldots, \tau_K)$. This implies we can use the same (estimators of the) optimal linear coefficients for hypothesis testing involving single, multiple, or even a continuum of quantile indices.
In the rest of this subsection, we focus on the estimation of $\theta^*_{a,s}(\tau)$. Note that $\theta^*_{a,s}(\tau)$ is the projection coefficient of $1\{Y_i \leq q_a(\tau)\}$ on $h_\tau(X_i)$ for the sub-population with $A_i = a$ and $S_i = s$. We estimate it by the sample analog. Specifically, the quantile $q_a(\tau)$ is unknown and is replaced by some $\sqrt{n}$-consistent estimator denoted $\hat q_a(\tau)$.

Assumption 8.
Assume that $\max_{a = 0, 1} \sup_{\tau \in \Upsilon} \left| \hat q_a(\tau) - q_a(\tau) \right| = O_p(n^{-1/2})$.

In practice, we compute $\hat q_a(\tau)$ based on (3.1) and (3.2) by setting $\hat{\overline m}_a = 0$. Then, Assumption 8 holds automatically by Theorem 3.1 with $\overline m_a = 0$. Analysis throughout this section takes into account that the estimator $\hat q_a(\tau)$ is used in place of $q_a(\tau)$.
Next, we define the estimator of $\theta^*_{a,s}(\tau)$. Recall $I_a(s) = \{i : A_i = a, S_i = s\}$ and $n_a(s) = |I_a(s)|$ defined in Assumption 3. For $a = 0, 1$ and $s \in \mathcal{S}$, let

$\hat\Sigma_{a,s}(\tau) = \frac{1}{n_a(s)} \sum_{i \in I_a(s)} h_\tau(X_i) h_\tau(X_i)^\top, \tag{5.4}$

$\hat\Gamma_{a,s}(\tau) = \frac{1}{n_a(s)} \sum_{i \in I_a(s)} h_\tau(X_i) 1\{Y_i \leq \hat q_a(\tau)\}, \tag{5.5}$

$\hat{\overline m}_a(\tau, s, x) = h_\tau(x)^\top \hat\theta_{a,s}(\tau), \tag{5.6}$

and

$\hat\theta_{a,s}(\tau) = \hat\Sigma_{a,s}^{-1}(\tau) \, \hat\Gamma_{a,s}(\tau). \tag{5.7}$
Assumption 9.
Suppose there exist a positive random variable $L_h$ with $E L_h^{d} < \infty$ for some $d > 0$ and a positive constant $c$ such that, for $\tau_1, \tau_2 \in \Upsilon$,

$\left\| h_{\tau_1}(X_i) - h_{\tau_2}(X_i) \right\| \leq L_h |\tau_1 - \tau_2|^{c}.$

We note that Assumption 9 holds automatically if the regressor $h_\tau(x)$ does not depend on $\tau$.
We refer to the QTE estimator adjusted by this linear probability model with optimal linear coefficients and their estimators as the LP estimator and denote it and its bootstrap counterpart as $\hat q^{LP}(\tau)$ and $\hat q^{LP, w}(\tau)$, respectively. Theorem 5.3 verifies Assumption 6 for the proposed estimator of the optimal linear coefficient. Then, by Theorem 5.1, Theorems 3.1 and 4.1 hold for $\hat q^{LP}(\tau)$ and $\hat q^{LP, w}(\tau)$, which implies all the estimation and inference methods established in the paper are valid for the LP estimator. Theorem 5.2 further shows $\hat q^{LP}(\tau)$ uses the optimal linear adjustment and is weakly more efficient than the QTE estimator with no adjustments.
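A sample-analog sketch of the LP adjustment, following (5.4)–(5.7), is given below: within each cell $\{A_i = a, S_i = s\}$, the indicator $1\{Y_i \leq \hat q_a(\tau)\}$ is projected on the regressors by OLS and the fitted values serve as the adjustment. The helper names are ours, and the pseudo-inverse is a numerical safeguard not present in the formulas.

```python
import numpy as np

def lp_adjustment(Y, A, S, H, q_hat, arm):
    """LP adjustment: cell-by-cell OLS projection of 1{Y <= q_hat} on the
    regressor matrix H (rows h_tau(X_i)), then prediction for all units
    in each stratum."""
    m_hat = np.empty(len(Y))
    for s in np.unique(S):
        cell = (A == arm) & (S == s)
        Hc = H[cell]
        z = (Y[cell] <= q_hat).astype(float)
        theta = np.linalg.pinv(Hc.T @ Hc) @ (Hc.T @ z)   # sample analogue of (5.7)
        m_hat[S == s] = H[S == s] @ theta
    return m_hat
```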
5.1.2 Logistic probability model
It is also common to consider the logistic regression as the adjustment and estimate the model by maximum likelihood (ML). The main goal of the working model is to approximate the true model as closely as possible. It is, therefore, useful to include additional technical regressors such as interactions in the logistic regression. The set of regressors used is denoted $h(X_i)$, which is allowed to contain the intercept. Let $\hat\theta_{a,s}(\tau)$ and $\theta_{a,s}(\tau)$ be the quasi-ML estimator and its corresponding pseudo true value, respectively, i.e.,

$\hat\theta_{a,s}(\tau) \in \arg\min_{\theta} \; -\frac{1}{n_a(s)} \sum_{i \in I_a(s)} \left[ 1\{Y_i \leq \hat q_a(\tau)\} \log \Lambda\left(h(X_i)^\top \theta\right) + 1\{Y_i > \hat q_a(\tau)\} \log\left(1 - \Lambda\left(h(X_i)^\top \theta\right)\right) \right] \tag{5.8}$

and

$\theta_{a,s}(\tau) \in \arg\min_{\theta} \; -E\left[ 1\{Y_i(a) \leq q_a(\tau)\} \log \Lambda\left(h(X_i)^\top \theta\right) + 1\{Y_i(a) > q_a(\tau)\} \log\left(1 - \Lambda\left(h(X_i)^\top \theta\right)\right) \mid S_i = s \right]. \tag{5.9}$

We then define

$\hat{\overline m}_a(\tau, s, x) = \Lambda\left(h(x)^\top \hat\theta_{a,s}(\tau)\right). \tag{5.10}$
In addition to the inclusion of technical regressors, we allow the pseudo true value $\theta_{a,s}(\tau)$ to vary across quantiles $\tau$, giving another layer of flexibility to the model. Such a model is called the distribution regression and was first proposed by Chernozhukov et al. (2013). We emphasize here that, although we aim to make the regression model as flexible as possible, our theory and results do not require the model to be correctly specified.
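A minimal sketch of this cell-by-cell distribution regression follows; it assumes the regressor matrix `H` already contains an intercept and any technical terms, and it relies on `statsmodels` for the logit QMLE. The coefficients vary with $\tau$ only through the plug-in quantile `q_hat`.

```python
import numpy as np
import statsmodels.api as sm

def qmle_logit_adjustment(Y, A, S, H, q_hat, arm):
    """Distribution-regression adjustment: within each cell {A = arm, S = s},
    fit a (possibly misspecified) logit of 1{Y <= q_hat} on H by quasi-ML,
    then predict for all units in stratum s."""
    m_hat = np.empty(len(Y))
    for s in np.unique(S):
        cell = (A == arm) & (S == s)
        z = (Y[cell] <= q_hat).astype(float)
        fit = sm.Logit(z, H[cell]).fit(disp=0)   # QMLE of theta_{a,s}(tau)
        m_hat[S == s] = fit.predict(H[S == s])
    return m_hat
```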
Assumption 10.
Suppose $\theta_{a,s}(\tau)$ is the unique minimizer defined in (5.9) for $a = 0, 1$, $s \in \mathcal{S}$, and $\tau \in \Upsilon$.
Theorem 5.4.
Suppose Assumptions 1, 2, 8, and 10 hold. Then Assumption 6(iii) holds for the quasi-ML estimator $\hat\theta_{a,s}(\tau)$ defined in (5.8), i.e., $\hat\theta_{a,s}(\tau)$ is uniformly consistent for the pseudo true value $\theta_{a,s}(\tau)$.
Four remarks are in order. First, we refer to the QTE estimator adjusted by the logistic model with QMLE as the ML estimator and denote it and its bootstrap counterpart as $\hat q^{ML}(\tau)$ and $\hat q^{ML, w}(\tau)$, respectively. Assumption 6(i) holds automatically for the logistic regression. If we further impose Assumption 6(ii), then Theorem 5.4 implies that all the estimation and bootstrap inference methods established in the paper are valid for the ML estimator. Second, we take into account that $\hat\theta_{a,s}(\tau)$ is computed with the true quantile $q_a(\tau)$ replaced by its estimator $\hat q_a(\tau)$ and derive the results in Theorem 5.4 under Assumption 8. Third, the ML estimator is not guaranteed to be optimal or to be more efficient than the QTE estimator with no adjustments. On the other hand, as we can include additional technical terms in the regression and allow the regression coefficients to vary across $\tau$, the logistic model can be close to the true model $m_a(\tau, s, x)$, which achieves the global minimum asymptotic variance based on Theorem 3.1. Fourth, in Section 5.2, we further justify the use of the ML estimator with a flexible logistic model by letting the number of technical terms (or equivalently, the dimension of $h(X_i)$) diverge to infinity, showing by this means that the ML estimator can indeed consistently estimate the true model and thereby achieve the global minimum covariance matrix of the adjusted QTE estimator.
5.1.3 Further improved logistic model
Although in simulations we cannot find a DGP in which the QTE estimator with the logistic adjustment is less efficient than that with no adjustments, theoretically such a scenario still exists. In this section, we follow the idea of Cohen and Fogarty (2020) and construct an estimator that is weakly more efficient than both the ML estimator and the estimator with no adjustments. We collect the two fitted logistic working models into the vector $\Psi_\tau(x, s) = \left(\overline m^{ML}_1(\tau, s, x), \overline m^{ML}_0(\tau, s, x)\right)^\top$ and treat it as the regressor in a linear adjustment, i.e., define $\overline m_a(\tau, s, x) = \Psi_\tau(x, s)^\top \theta_{a,s}(\tau)$. Then, the logistic adjustment in Section 5.1.2 and no adjustments correspond to $\theta_{a,s}(\tau) = e_a$ and $\theta_{a,s}(\tau) = 0$ for $a = 0, 1$, respectively, where $e_1 = (1, 0)^\top$ and $e_0 = (0, 1)^\top$. However, following Theorem 5.2, the optimal linear coefficient with regressor $\Psi_\tau(x, s)$ is

$\theta^*_{a,s}(\tau) = \left( E\left[ \Psi_{\tau, i} \Psi_{\tau, i}^\top \mid A_i = a, S_i = s \right] \right)^{-1} E\left[ \Psi_{\tau, i} 1\{Y_i \leq q_a(\tau)\} \mid A_i = a, S_i = s \right], \tag{5.11}$

where $\Psi_{\tau, i} = \Psi_\tau(X_i, S_i)$. Using the adjustment term with $\theta^*_{a,s}(\tau)$ is asymptotically weakly more efficient than any other choice of $\theta_{a,s}(\tau)$. In practice, we do not observe $\Psi_{\tau, i}$, but can replace it by its feasible version $\hat\Psi_{\tau, i} = \left(\hat{\overline m}^{ML}_1(\tau, S_i, X_i), \hat{\overline m}^{ML}_0(\tau, S_i, X_i)\right)^\top$. We then define

$\overline m_a(\tau, s, x) = \Psi_\tau(x, s)^\top \theta^*_{a,s}(\tau), \tag{5.12}$

$\hat{\overline m}_a(\tau, S_i, X_i) = \hat\Psi_{\tau, i}^\top \hat\theta_{a,s}(\tau), \tag{5.13}$

$\hat\Sigma^{\Psi}_{a,s}(\tau) = \frac{1}{n_a(s)} \sum_{i \in I_a(s)} \hat\Psi_{\tau, i} \hat\Psi_{\tau, i}^\top, \qquad \hat\Gamma^{\Psi}_{a,s}(\tau) = \frac{1}{n_a(s)} \sum_{i \in I_a(s)} \hat\Psi_{\tau, i} 1\{Y_i \leq \hat q_a(\tau)\}, \tag{5.14}$

and

$\hat\theta_{a,s}(\tau) = \left(\hat\Sigma^{\Psi}_{a,s}(\tau)\right)^{-1} \hat\Gamma^{\Psi}_{a,s}(\tau). \tag{5.15}$
Assumption 11.
(i) There exist constants $0 < c \leq C < \infty$ such that

$c \leq \inf_{\tau \in \Upsilon} \min_{a = 0, 1, \, s \in \mathcal{S}} \lambda_{\min}\left( E\left[ \Psi_{\tau, i} \Psi_{\tau, i}^\top \mid A_i = a, S_i = s \right] \right) \leq \sup_{\tau \in \Upsilon} \max_{a = 0, 1, \, s \in \mathcal{S}} \lambda_{\max}\left( E\left[ \Psi_{\tau, i} \Psi_{\tau, i}^\top \mid A_i = a, S_i = s \right] \right) \leq C.$

(ii) Suppose

$\max_{a = 0, 1} \sup_{\tau \in \Upsilon} \frac{1}{n} \sum_{i=1}^{n} \left\| \hat\Psi_{\tau, i} - \Psi_{\tau, i} \right\|^2 = o_p(1).$
Theorem 5.5.
Denote $\hat q^{LPML}(\tau)$ and $\hat q^{LPML, w}(\tau)$ as the $\tau$th QTE estimator and its multiplier bootstrap counterpart defined in Sections 3 and 4, respectively, with $\overline m_a$ and $\hat{\overline m}_a$ defined in (5.12) and (5.13), respectively. Suppose Assumptions 1, 2, 8, 10, and 11, together with mild moment regularity conditions, hold. Then, Assumptions 3 and 5 hold, which further implies Theorems 3.1 and 4.1 hold for $\hat q^{LPML}(\tau)$ and $\hat q^{LPML, w}(\tau)$, respectively. Further denote the asymptotic covariance matrices of $\{\hat q^{LPML}(\tau_k)\}_{k \in [K]}$, $\{\hat q^{ML}(\tau_k)\}_{k \in [K]}$, and $\{\hat q^{NA}(\tau_k)\}_{k \in [K]}$ for any finite set of quantile indices $(\tau_1, \ldots, \tau_K)$ as $\Sigma^{LPML}$, $\Sigma^{ML}$, and $\Sigma^{NA}$, respectively, where $\hat q^{NA}(\tau)$ is the $\tau$th QTE estimator without adjustments. Then we have

$\Sigma^{LPML} \leq \Sigma^{ML} \quad \text{and} \quad \Sigma^{LPML} \leq \Sigma^{NA}$

in the matrix sense.
In practice, when $n(s)$ is small, $\hat\Psi_{\tau, i}$ may be nearly multicollinear within some stratum, which can lead to size distortion in inference concerning the QTE. We therefore suggest first normalizing each component of $\hat\Psi_{\tau, i}$ by its standard deviation (denoting the normalized regressor $\tilde\Psi_{\tau, i}$) and then running a ridge regression

$\tilde\theta_{a,s}(\tau) = \left( \frac{1}{n_a(s)} \sum_{i \in I_a(s)} \tilde\Psi_{\tau, i} \tilde\Psi_{\tau, i}^\top + \lambda_n I_2 \right)^{-1} \frac{1}{n_a(s)} \sum_{i \in I_a(s)} \tilde\Psi_{\tau, i} 1\{Y_i \leq \hat q_a(\tau)\},$

where $I_2$ is the two-dimensional identity matrix and $\lambda_n \rightarrow 0$. Then, the final regression adjustment is

$\hat{\overline m}_a(\tau, S_i, X_i) = \tilde\Psi_{\tau, i}^\top \tilde\theta_{a,s}(\tau).$

Given Assumption 11, such a ridge penalty is asymptotically negligible and all the results in Theorem 5.5 still hold. (In unreported simulations, we find that with larger sample sizes the ridge regularization is unnecessary and the original adjustment (i.e., (5.13)) has no size distortion, implying that near-multicollinearity is indeed just a finite-sample issue.)
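Putting the pieces of this subsection together, a rough sketch of the further-improved adjustment with the ridge safeguard reads as follows; the penalty level `lam` and the normalization details are illustrative assumptions rather than the paper's exact choices.

```python
import numpy as np

def lpml_adjustment(Y, A, S, m1_ml, m0_ml, q_hat, arm, lam=0.01):
    """Further-improved adjustment: treat the two fitted logistic adjustments
    as a two-dimensional linear regressor Psi and recompute the optimal linear
    coefficient cell by cell, with a small ridge term guarding against
    near-multicollinearity in small strata (lam is illustrative)."""
    psi = np.column_stack([m1_ml, m0_ml])
    psi = psi / psi.std(axis=0)                   # normalize by standard deviations
    m_new = np.empty(len(Y))
    for s in np.unique(S):
        cell = (A == arm) & (S == s)
        z = (Y[cell] <= q_hat).astype(float)
        G = psi[cell]
        beta = np.linalg.solve(G.T @ G / len(z) + lam * np.eye(2),
                               G.T @ z / len(z))  # ridge-regularized projection
        m_new[S == s] = psi[S == s] @ beta
    return m_new
```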
5.2 Nonparametric method
This section considers nonparametric estimation of $m_a(\tau, s, x)$ when the dimension of $X_i$ is fixed. For ease of notation, we assume all coordinates of $X_i$ are continuously distributed. If in an application some elements of $X_i$ are discrete, the dimension is interpreted as the dimension of the continuous covariates. All results in this section can then be extended in a conceptually straightforward manner by using the continuous covariates only within samples that are homogeneous in the discrete covariates.
As $m_a(\tau, s, x)$ is nonparametrically estimated, we have $\overline m_a = m_a$. We estimate $m_a$ by the sieve method of fitting a logistic model, as studied by Hirano et al. (2003). Specifically, recall $\Lambda$ is the logistic CDF and denote the number of sieve bases by $K_n$, which depends on the sample size $n$ and can grow to infinity as $n \rightarrow \infty$. Let $b^{K_n}(x) = (b_1(x), \ldots, b_{K_n}(x))^\top$, where $b^{K_n}(\cdot)$ is a $K_n$-dimensional basis of a linear sieve space. More details on the sieve space are given in Section B of the Online Supplement. Denote

$\hat{\overline m}_a(\tau, s, x) = \Lambda\left(b^{K_n}(x)^\top \hat\theta_{a,s}(\tau)\right) \tag{5.16}$

and

$\hat\theta_{a,s}(\tau) \in \arg\min_{\theta} \; -\frac{1}{n_a(s)} \sum_{i \in I_a(s)} \left[ 1\{Y_i \leq \hat q_a(\tau)\} \log \Lambda\left(b^{K_n}(X_i)^\top \theta\right) + 1\{Y_i > \hat q_a(\tau)\} \log\left(1 - \Lambda\left(b^{K_n}(X_i)^\top \theta\right)\right) \right]. \tag{5.17}$
We refer to the QTE estimator with the nonparametric adjustment as the NP estimator. Note that we use the estimator $\hat q_a(\tau)$ of $q_a(\tau)$ in (5.17), where $\hat q_a(\tau)$ satisfies Assumption 8. All the analysis in this section takes account of the fact that $\hat q_a(\tau)$ instead of $q_a(\tau)$ is used.
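Since NP and ML share the same implementation once the basis is fixed (see the remarks after Theorem 5.6), a sketch only requires a basis builder on top of the cell-by-cell logit fit from Section 5.1.2. The polynomial-plus-interactions basis below is one possible choice; the paper's sieve space is detailed in Section B of the Online Supplement.

```python
import numpy as np
import statsmodels.api as sm

def sieve_basis(X, degree=2):
    """A simple polynomial sieve: intercept, powers, and pairwise
    interactions of the continuous covariates (one possible basis)."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, j] ** p for j in range(d) for p in range(1, degree + 1)]
    cols += [X[:, j] * X[:, k] for j in range(d) for k in range(j + 1, d)]
    return np.column_stack(cols)

def np_adjustment(Y, A, S, X, q_hat, arm, degree=2):
    """NP adjustment: sieve logistic regression of 1{Y <= q_hat} on the
    basis, fit within each cell {A = arm, S = s} as in (5.16)-(5.17)."""
    H = sieve_basis(X, degree)
    m_hat = np.empty(len(Y))
    for s in np.unique(S):
        cell = (A == arm) & (S == s)
        z = (Y[cell] <= q_hat).astype(float)
        fit = sm.Logit(z, H[cell]).fit(disp=0)
        m_hat[S == s] = fit.predict(H[S == s])
    return m_hat
```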
Assumption 12.
(i) There exist constants $0 < c \leq C < \infty$ such that, with probability approaching one, the eigenvalues of the sample and population Gram matrices of the basis $b^{K_n}(X_i)$ are bounded below by $c$ and above by $C$.
(ii) For $a = 0, 1$, there exists a $K_n$-dimensional vector $\theta_{a,s}(\tau)$ such that, for $r_a(\tau, s, x) = m_a(\tau, s, x) - \Lambda\left(b^{K_n}(x)^\top \theta_{a,s}(\tau)\right)$, the approximation error $\sup_{\tau \in \Upsilon, \, s \in \mathcal{S}, \, x \in \mathcal{X}} |r_a(\tau, s, x)|$ vanishes asymptotically at a suitable rate.
(iii) For $a = 0, 1$, there exists a constant $C > 0$ such that $C \leq m_a(\tau, s, x) \leq 1 - C$ over $\tau \in \Upsilon$, $s \in \mathcal{S}$, and $x \in \mathcal{X}$.
(iv) Suppose, for some constant $c$, the quantities $\zeta(K_n) = \sup_{x \in \mathcal{X}} \|b^{K_n}(x)\|$ and $K_n$ satisfy rate restrictions relative to $n$, where $b_k(x)$ denotes the $k$th coordinate of $b^{K_n}(x)$.
Four remarks are in order. First, Assumption 12(i) is standard in the sieve literature. Second, Assumption 12(ii) means the approximation error of the sieve logistic model vanishes asymptotically, which holds given sufficient smoothness of $m_a(\tau, s, x)$ in $x$. Third, Assumption 12(iii) usually holds when the support of $X_i$ is compact. This condition is also assumed by Hirano et al. (2003). Fourth, the quantity $\zeta(K_n)$ in Assumption 12(iv) depends on the choice of basis functions. For example, $\zeta(K_n) = O(K_n^{1/2})$ for splines and $\zeta(K_n) = O(K_n)$ for power series. Taking splines as an example, Assumption 12(iv) then reduces to a rate condition on $K_n$ relative to $n$.
Theorem 5.6.
Denote $\hat q^{NP}(\tau)$ and $\hat q^{NP, w}(\tau)$ as the $\tau$th QTE estimator and its multiplier bootstrap counterpart defined in Sections 3 and 4, respectively, with $\hat{\overline m}_a$ defined in (5.16). Further suppose Assumptions 1, 2, 4, 8, and 12 hold. Then, Assumptions 3 and 5 hold, which further implies that Theorems 3.1 and 4.1 hold for $\hat q^{NP}(\tau)$ and $\hat q^{NP, w}(\tau)$, respectively. In addition, for any finite-dimensional quantile indices $(\tau_1, \ldots, \tau_K)$, the covariance matrix of $\{\hat q^{NP}(\tau_k)\}_{k \in [K]}$ achieves the minimum (in the matrix sense) as characterized in Theorem 3.1.
Three remarks are in order. First, as the nonparametric regression consistently estimates the true specification $m_a(\tau, s, x)$, the QTE estimator adjusted by the nonparametric regression achieves the global minimum asymptotic variance, and thus is weakly more efficient than the QTE estimators with the linear and logistic adjustments studied in the previous section. Second, the practical implementations of the NP and ML methods are the same, given that they share the same set of covariates (basis functions). Therefore, even if we include a small number of basis functions so that $K_n$ is better treated as fixed, the proposed estimation and inference methods for the regression-adjusted QTE estimator are still valid, although they may not be optimal. Third, in Section A of the Online Supplement, we consider computing $\hat{\overline m}_a$ via an $\ell_1$-penalized logistic regression when the dimension of the regressors can be comparable to or even higher than the sample size. We then provide primitive conditions under which we verify Assumptions 3 and 5.
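For the high-dimensional case mentioned in the third remark, a minimal sketch of the $\ell_1$-penalized logit fit could look like the following; it omits the post-Lasso refit used in the Online Supplement, and the penalty level `C` is illustrative rather than the paper's tuning rule.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def lasso_logit_adjustment(Y, A, S, H, q_hat, arm, C=1.0):
    """l1-penalized logistic adjustment: within each cell {A = arm, S = s},
    fit a penalized logit of 1{Y <= q_hat} on a wide dictionary H and
    predict for all units in stratum s. Assumes both classes occur in
    each cell."""
    m_hat = np.empty(len(Y))
    for s in np.unique(S):
        cell = (A == arm) & (S == s)
        z = (Y[cell] <= q_hat).astype(int)
        clf = LogisticRegression(penalty='l1', solver='liblinear', C=C)
        clf.fit(H[cell], z)
        m_hat[S == s] = clf.predict_proba(H[S == s])[:, 1]
    return m_hat
```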
6 Simulations
6.1 Data generating processes
Two DGPs are used to assess the finite sample performance of the estimation and inference methods introduced in the paper. We consider the outcome equation (6.1), in which one component is common to both designs while the remaining components are specified separately as follows.
(i) Let the stratification variable be standardized Beta distributed. $X_i$ contains two covariates, where the first follows a uniform distribution, the second follows a standard normal distribution, and the two are independent. The remaining components of (6.1) are defined as functions of jointly standard normal random variables.
(ii) Let the stratification variable be uniformly distributed, and let $X_i$ be the same as defined in DGP (i). The remaining components of (6.1) involve error terms that are mutually independently distributed.
For each DGP, we consider the following four randomization schemes, as in Zhang and Zheng (2020), with target fraction $\pi(s) = 1/2$ for all $s \in \mathcal{S}$:
(i) SRS: Treatment assignment is generated as in Example 1.
(ii) WEI: Treatment assignment is generated as in Example 2 with $f(x) = (1 - x)/2$.
(iii) BCD: Treatment assignment is generated as in Example 3 with $\lambda = 0.75$.
(iv) SBR: Treatment assignment is generated as in Example 4.
We assess the empirical size and power of the tests for sample sizes $n = 200$ and $n = 400$. We compute the true QTEs and their differences by simulation, with a sample size of 10,000 and 1,000 replications. To compute power, we perturb the true values by a constant. We examine three null hypotheses:
(i) a pointwise test of the QTE at a single quantile index;
(ii) a test for the difference between the QTEs at two quantile indices;
(iii) a uniform test of the QTE over a compact set of quantile indices.
For the pointwise test, we report the results for the median ($\tau = 0.5$) in the main text and give the cases $\tau = 0.25$ and $\tau = 0.75$ in the Online Supplement.
6.2 Estimation methods
We consider the following estimation methods of the auxiliary regression.
-
(i)
NA: the estimator with no adjustments, i.e., setting .
-
(ii)
LP: the linear probability model with regressors and the pseudo true value estimated by defined in (5.7).
-
(iii)
ML: the logistic model with regressor and the pseudo true value estimated by defined in (5.8).
-
(iv)
LPML: the logistic model with regressor and the pseudo true value estimated by defined in (5.15).
-
(v)
MLX: the logistic model with regressor and the pseudo true value estimated by defined in (5.8).
-
(vi)
LPMLX: the logistic model with regressor and the pseudo true value estimated by defined in (5.15).
-
(vii)
NP: the logistic model with regressor where and are the sample medians of and , respectively. The pseudo true value is estimated by defined in (5.17).
6.3 Simulation results
Table 1 presents the empirical size and power for the pointwise test with $\tau = 0.5$ under DGPs (i) and (ii). We make six observations. First, none of the auxiliary regressions is correctly specified, but test sizes are all close to the nominal level of 5%, confirming that estimation and inference are robust to misspecification. Second, the inclusion of auxiliary regressions improves the efficiency of the QTE estimator, as the powers for method "NA" are the lowest among all the methods for both DGPs and all randomization schemes. This finding is consistent with theory because methods "LP", "LPML", "LPMLX", and "NP" are guaranteed to be weakly more efficient than "NA". Third, the powers of methods "LPML" and "LPMLX" are higher than those of methods "ML" and "MLX", respectively. This is consistent with our theory that methods "LPML" and "LPMLX" further improve "ML" and "MLX", respectively. In addition, methods "MLX" and "LPMLX" fit a flexible distribution regression that can approximate the true DGP well. Therefore, the powers of "MLX" and "LPMLX" are respectively much larger than those of "ML" and "LPML". For the same reason, we observe that the power of "LPMLX" is close to that of "NP". (The results in Section C of the Online Supplement show that "LPMLX" has much smaller bias than "NP" while its variance is similar, which makes "LPMLX" preferable in practice.) Fourth, the powers of method "NP" are the best because it estimates the true specification and achieves the minimum asymptotic variance, as shown in Theorem 5.6. Fifth, when the sample size is 200, the method "NP" slightly over-rejects, but its size becomes closer to nominal when the sample size increases to 400. Sixth, the improvement in power of the "LPMLX" estimator upon "NA" (i.e., with no adjustments) is due to an approximately 12–15% reduction of the standard error of the QTE estimator on average. (The biases and standard errors are reported in Section C of the Online Supplement.)
Columns: Size (n = 200), Size (n = 400), Power (n = 200), Power (n = 400); each block reports SRS, WEI, BCD, and SBR.

Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Panel A: DGP (i) | ||||||||||||||||
NA | 0.055 | 0.054 | 0.050 | 0.054 | 0.051 | 0.054 | 0.051 | 0.051 | 0.404 | 0.406 | 0.403 | 0.406 | 0.665 | 0.676 | 0.681 | 0.681 |
LP | 0.052 | 0.050 | 0.049 | 0.052 | 0.048 | 0.053 | 0.051 | 0.052 | 0.491 | 0.497 | 0.502 | 0.492 | 0.779 | 0.788 | 0.790 | 0.791 |
ML | 0.053 | 0.050 | 0.049 | 0.055 | 0.051 | 0.050 | 0.052 | 0.052 | 0.472 | 0.478 | 0.483 | 0.473 | 0.759 | 0.768 | 0.775 | 0.773 |
LPML | 0.054 | 0.052 | 0.052 | 0.057 | 0.052 | 0.054 | 0.051 | 0.053 | 0.506 | 0.509 | 0.523 | 0.513 | 0.802 | 0.812 | 0.814 | 0.809 |
MLX | 0.056 | 0.059 | 0.055 | 0.057 | 0.055 | 0.054 | 0.055 | 0.058 | 0.475 | 0.479 | 0.486 | 0.482 | 0.752 | 0.759 | 0.760 | 0.760 |
LPMLX | 0.060 | 0.058 | 0.059 | 0.058 | 0.054 | 0.055 | 0.054 | 0.054 | 0.506 | 0.513 | 0.521 | 0.512 | 0.802 | 0.810 | 0.813 | 0.811 |
NP | 0.063 | 0.059 | 0.062 | 0.064 | 0.055 | 0.054 | 0.054 | 0.056 | 0.523 | 0.523 | 0.531 | 0.526 | 0.804 | 0.811 | 0.814 | 0.809 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.046 | 0.051 | 0.045 | 0.047 | 0.047 | 0.045 | 0.048 | 0.047 | 0.479 | 0.489 | 0.500 | 0.490 | 0.773 | 0.775 | 0.774 | 0.782 |
LP | 0.049 | 0.051 | 0.050 | 0.050 | 0.045 | 0.048 | 0.050 | 0.045 | 0.572 | 0.581 | 0.589 | 0.579 | 0.851 | 0.856 | 0.857 | 0.854 |
ML | 0.051 | 0.058 | 0.050 | 0.054 | 0.049 | 0.046 | 0.050 | 0.048 | 0.524 | 0.534 | 0.541 | 0.539 | 0.812 | 0.810 | 0.807 | 0.807 |
LPML | 0.051 | 0.058 | 0.054 | 0.053 | 0.050 | 0.049 | 0.053 | 0.047 | 0.574 | 0.581 | 0.588 | 0.580 | 0.862 | 0.863 | 0.863 | 0.863 |
MLX | 0.058 | 0.059 | 0.056 | 0.059 | 0.051 | 0.049 | 0.051 | 0.050 | 0.566 | 0.574 | 0.583 | 0.573 | 0.826 | 0.824 | 0.827 | 0.827 |
LPMLX | 0.057 | 0.062 | 0.057 | 0.060 | 0.052 | 0.050 | 0.053 | 0.052 | 0.615 | 0.620 | 0.630 | 0.627 | 0.878 | 0.878 | 0.880 | 0.879 |
NP | 0.063 | 0.066 | 0.062 | 0.062 | 0.056 | 0.055 | 0.056 | 0.051 | 0.622 | 0.625 | 0.632 | 0.628 | 0.883 | 0.880 | 0.882 | 0.879 |
Tables 2 and 3 present sizes and powers of the test for the difference of two QTEs and of the uniform test over quantile indices, respectively, for DGPs (i) and (ii) and the four randomization schemes. All the observations made above apply to these results. The improvement in power of the "LPMLX" estimator upon "NA" (i.e., with no adjustments) is due to a 9% reduction of the standard error of the difference of the QTE estimators on average. In Section C of the Online Supplement, we provide additional simulation results, such as the empirical sizes and powers for the pointwise tests with $\tau = 0.25$ and $\tau = 0.75$, the bootstrap inference with the true target fraction, and the adjusted QTE estimator when the DGP contains high-dimensional covariates and the adjustments are computed via logistic Lasso. We also report the biases and standard errors of the adjusted QTE estimators.
Columns: Size (n = 200), Size (n = 400), Power (n = 200), Power (n = 400); each block reports SRS, WEI, BCD, and SBR.

Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Panel A: DGP (i) | ||||||||||||||||
NA | 0.043 | 0.045 | 0.040 | 0.041 | 0.044 | 0.043 | 0.041 | 0.043 | 0.214 | 0.216 | 0.209 | 0.203 | 0.387 | 0.389 | 0.383 | 0.365 |
LP | 0.045 | 0.048 | 0.043 | 0.045 | 0.045 | 0.047 | 0.043 | 0.045 | 0.246 | 0.242 | 0.234 | 0.248 | 0.424 | 0.422 | 0.422 | 0.421 |
ML | 0.045 | 0.045 | 0.043 | 0.042 | 0.046 | 0.047 | 0.040 | 0.048 | 0.234 | 0.233 | 0.231 | 0.239 | 0.415 | 0.422 | 0.417 | 0.426 |
LPML | 0.044 | 0.049 | 0.045 | 0.045 | 0.049 | 0.049 | 0.044 | 0.047 | 0.250 | 0.250 | 0.248 | 0.259 | 0.451 | 0.453 | 0.450 | 0.459 |
MLX | 0.046 | 0.052 | 0.046 | 0.047 | 0.047 | 0.047 | 0.044 | 0.049 | 0.232 | 0.234 | 0.229 | 0.241 | 0.415 | 0.415 | 0.404 | 0.416 |
LPMLX | 0.049 | 0.055 | 0.047 | 0.047 | 0.049 | 0.050 | 0.047 | 0.047 | 0.247 | 0.249 | 0.249 | 0.258 | 0.445 | 0.453 | 0.445 | 0.453 |
NP | 0.050 | 0.054 | 0.050 | 0.051 | 0.052 | 0.052 | 0.047 | 0.048 | 0.246 | 0.248 | 0.245 | 0.257 | 0.444 | 0.444 | 0.442 | 0.450 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.039 | 0.044 | 0.040 | 0.038 | 0.044 | 0.041 | 0.039 | 0.047 | 0.211 | 0.225 | 0.217 | 0.194 | 0.399 | 0.396 | 0.392 | 0.383 |
LP | 0.043 | 0.048 | 0.045 | 0.040 | 0.045 | 0.044 | 0.042 | 0.047 | 0.244 | 0.255 | 0.251 | 0.245 | 0.447 | 0.440 | 0.441 | 0.455 |
ML | 0.049 | 0.046 | 0.046 | 0.043 | 0.044 | 0.045 | 0.042 | 0.048 | 0.217 | 0.228 | 0.213 | 0.212 | 0.379 | 0.386 | 0.386 | 0.396 |
LPML | 0.047 | 0.051 | 0.048 | 0.043 | 0.047 | 0.045 | 0.047 | 0.048 | 0.253 | 0.258 | 0.253 | 0.252 | 0.456 | 0.451 | 0.454 | 0.468 |
MLX | 0.047 | 0.051 | 0.047 | 0.047 | 0.046 | 0.046 | 0.045 | 0.049 | 0.226 | 0.240 | 0.228 | 0.223 | 0.394 | 0.392 | 0.391 | 0.399 |
LPMLX | 0.053 | 0.056 | 0.051 | 0.048 | 0.051 | 0.049 | 0.045 | 0.050 | 0.261 | 0.272 | 0.265 | 0.263 | 0.467 | 0.460 | 0.460 | 0.477 |
NP | 0.056 | 0.058 | 0.053 | 0.052 | 0.051 | 0.052 | 0.045 | 0.050 | 0.266 | 0.275 | 0.266 | 0.270 | 0.469 | 0.459 | 0.461 | 0.479 |
Columns: Size (n = 200), Size (n = 400), Power (n = 200), Power (n = 400); each block reports SRS, WEI, BCD, and SBR.

Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Panel A: DGP (i) | ||||||||||||||||
NA | 0.048 | 0.044 | 0.044 | 0.045 | 0.047 | 0.049 | 0.045 | 0.048 | 0.450 | 0.451 | 0.455 | 0.454 | 0.765 | 0.770 | 0.769 | 0.770 |
LP | 0.045 | 0.044 | 0.043 | 0.045 | 0.047 | 0.051 | 0.047 | 0.046 | 0.589 | 0.588 | 0.589 | 0.581 | 0.902 | 0.901 | 0.904 | 0.900 |
ML | 0.047 | 0.044 | 0.043 | 0.045 | 0.044 | 0.051 | 0.045 | 0.047 | 0.570 | 0.577 | 0.582 | 0.568 | 0.887 | 0.889 | 0.893 | 0.890 |
LPML | 0.046 | 0.046 | 0.045 | 0.047 | 0.046 | 0.050 | 0.046 | 0.051 | 0.603 | 0.605 | 0.616 | 0.607 | 0.916 | 0.917 | 0.915 | 0.915 |
MLX | 0.052 | 0.049 | 0.048 | 0.048 | 0.046 | 0.053 | 0.050 | 0.050 | 0.582 | 0.582 | 0.595 | 0.576 | 0.889 | 0.893 | 0.891 | 0.889 |
LPMLX | 0.053 | 0.047 | 0.049 | 0.052 | 0.047 | 0.053 | 0.050 | 0.050 | 0.612 | 0.614 | 0.619 | 0.610 | 0.915 | 0.919 | 0.919 | 0.913 |
NP | 0.056 | 0.055 | 0.054 | 0.055 | 0.050 | 0.057 | 0.052 | 0.054 | 0.633 | 0.627 | 0.633 | 0.629 | 0.916 | 0.919 | 0.918 | 0.915 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.038 | 0.039 | 0.039 | 0.038 | 0.045 | 0.039 | 0.040 | 0.045 | 0.572 | 0.571 | 0.579 | 0.574 | 0.878 | 0.882 | 0.879 | 0.879 |
LP | 0.041 | 0.044 | 0.045 | 0.041 | 0.044 | 0.043 | 0.039 | 0.042 | 0.704 | 0.708 | 0.710 | 0.700 | 0.953 | 0.955 | 0.956 | 0.955 |
ML | 0.044 | 0.043 | 0.048 | 0.041 | 0.047 | 0.045 | 0.043 | 0.044 | 0.661 | 0.660 | 0.664 | 0.655 | 0.931 | 0.931 | 0.933 | 0.935 |
LPML | 0.047 | 0.046 | 0.048 | 0.044 | 0.047 | 0.046 | 0.041 | 0.046 | 0.723 | 0.714 | 0.720 | 0.714 | 0.964 | 0.963 | 0.965 | 0.964 |
MLX | 0.052 | 0.050 | 0.052 | 0.049 | 0.048 | 0.046 | 0.045 | 0.045 | 0.703 | 0.710 | 0.708 | 0.704 | 0.946 | 0.949 | 0.946 | 0.951 |
LPMLX | 0.056 | 0.054 | 0.054 | 0.051 | 0.052 | 0.048 | 0.046 | 0.048 | 0.761 | 0.761 | 0.766 | 0.754 | 0.972 | 0.972 | 0.972 | 0.974 |
NP | 0.060 | 0.060 | 0.062 | 0.058 | 0.055 | 0.052 | 0.047 | 0.051 | 0.770 | 0.771 | 0.773 | 0.765 | 0.973 | 0.974 | 0.972 | 0.974 |
6.4 Practical recommendations
When $X_i$ is finite-dimensional, we suggest using the LPMLX adjustment, in which the logistic model includes interaction terms and the regression coefficients are allowed to depend on $\tau$. When $X_i$ is high-dimensional, we suggest using the logistic Lasso to estimate the regression adjustment. (The relevant theory and simulation results on high-dimensional covariates are provided in Section A of the Online Supplement.)
7 Empirical Application
Undersaving has been found to have important individual and social welfare consequences (Karlan et al., 2014). Does expanding access to bank accounts for the poor lead to an overall increase in savings? To answer the question, Dupas et al. (2018) conducted a covariate-adaptive randomized experiment in Uganda, Malawi, and Chile to study the impact of a bank account subsidy on savings. In their paper, the authors examined the ATEs as well as the QTEs of the subsidy. This section reports an application of our methods to the same dataset to examine the QTEs of the subsidy on household total savings in Uganda.
The sample consists of 2160 households in Uganda. (We filter out observations with missing values; our final sample contains 1952 households.) Within each of 41 strata defined by gender, occupation, and bank branch, 50 percent of the households in the sample were randomly assigned to receive the bank account subsidy and the rest were assigned to the control group. This is a stratified block randomization design with 41 strata, which satisfies Assumption 1 in Section 2. The target fraction of treated units is 1/2. It is straightforward to see that statements (i), (ii), and (iii) in Assumption 1 are satisfied. Because stratified block randomization satisfies $D_n(s) = o_p(n)$ (see Example 4), it is reasonable to claim that Assumption 1(iv) is also satisfied in our analysis.
After the randomization and the intervention, the authors conducted three rounds of follow-up surveys in Uganda (see Dupas et al. (2018) for a detailed description). In this section, we focus on the first-round follow-up survey to examine the impact of the bank account subsidy on total savings.
Tables 4 and 5 present the QTE estimates and their standard errors (in parentheses) estimated by different methods at quantile indices 0.25, 0.5, and 0.75. The description of these estimators is similar to that in Section 6.121212Specifically, we have: (i) NA: the estimator with no adjustments. (ii) LP: the linear probability model. When there is only one auxiliary regressor, , and when there are four auxiliary regressors, , where represent four covariates used in the regression adjustment. (iii) ML: the logistic probability model with regressor , where is the same as that in the LP model. (iv) LPML: the further improved logistic probability model with regressor , where is the same as that in the LP model. (v) MLX: the logistic probability model with interaction terms. MLX is only applied to the case with four auxiliary regressors, with . (vi) LPMLX: the further improved logistic probability model with interaction terms. LPMLX is only applied to the case with four auxiliary regressors, with the same as that used in the MLX model. (vii) NP: the nonparametric logistic probability model with regressor . NP is only applied to the case with four auxiliary regressors, where and are the sample medians of and , respectively. (viii) Lasso: the logistic probability model with regressor and post-Lasso coefficient estimator . Lasso is only applied to the case with four auxiliary regressors, with . The post-Lasso estimator is defined in (A.2). The choice of tuning parameter and the estimation procedure are detailed in Section B.3. In the analysis, we focus on two sets of additional baseline variables: the baseline value of total savings only (one auxiliary regressor) and the baseline value of total savings, household size, age, and a married female dummy (four auxiliary regressors). The first set of regressors follows Dupas et al. (2018). The second is used to illustrate all the methods discussed in the paper. Tables 4 and 5 report the results with one and four auxiliary regressors, respectively.
Quantile | NA | LP | ML | LPML
---|---|---|---|---
25% | 1.105 | 1.105 | 1.105 | 1.105
 | (0.564) | (0.564) | (0.470) | (0.470)
50% | 3.682 | 3.682 | 3.682 | 3.682
 | (1.010) | (1.080) | (1.146) | (1.033)
75% | 7.363 | 9.204 | 9.204 | 9.204
 | (3.757) | (4.227) | (3.616) | (3.757)
Notes: The table presents the QTE estimates of the effect of the bank account subsidy on household total savings at quantiles 25%, 50%, and 75% when only one auxiliary regressor (baseline value of total savings) is used in the regression adjustment models. Standard errors are in parentheses.
Quantile | NA | LP | ML | LPML | MLX | LPMLX | NP | Lasso
---|---|---|---|---|---|---|---|---
25% | 1.105 | 1.473 | 1.105 | 1.105 | 1.105 | 1.105 | 1.105 | 1.105
 | (0.564) | (0.564) | (0.564) | (0.564) | (0.357) | (0.319) | (0.188) | (0.564)
50% | 3.682 | 3.682 | 3.682 | 3.682 | 3.682 | 3.682 | 3.682 | 3.682
 | (1.010) | (1.033) | (0.939) | (0.939) | (0.958) | (1.033) | (0.939) | (0.939)
75% | 7.363 | 8.100 | 7.363 | 7.363 | 7.363 | 7.363 | 7.363 | 7.363
 | (3.757) | (3.757) | (3.757) | (3.569) | (3.757) | (3.663) | (3.663) | (3.757)
Notes: The table shows QTE estimates of the effect of the bank account subsidy on household total savings at quantiles 25%, 50%, and 75% when four auxiliary regressors (baseline value of total savings, household size, age, and married female dummy) are used in the regression adjustment models. Standard errors are in parentheses.
Difference | NA | LP | ML | LPML | MLX | LPMLX | NP | Lasso
---|---|---|---|---|---|---|---|---
50%-25% | 2.577 | 2.209 | 2.577 | 2.577 | 2.577 | 2.577 | 2.577 | 2.577
 | (0.939) | (1.104) | (0.939) | (0.939) | (0.958) | (1.033) | (0.845) | (0.911)
75%-50% | 3.682 | 4.418 | 3.682 | 3.682 | 3.682 | 3.682 | 3.682 | 3.682
 | (3.757) | (3.663) | (3.663) | (3.287) | (3.475) | (3.287) | (3.663) | (3.757)
75%-25% | 6.259 | 6.627 | 6.259 | 6.259 | 6.259 | 6.259 | 6.259 | 6.259
 | (3.851) | (3.757) | (3.757) | (3.695) | (3.588) | (3.569) | (3.287) | (3.832)
Notes: The table presents tests for the difference between two QTE estimates of the effect of the bank account subsidy on household total savings when there are four auxiliary regressors: baseline value of total savings, household size, age, and married female dummy. Standard errors are in parentheses.
Figure 1: QTE estimates of the effect of the bank account subsidy on the distribution of household total savings.
Notes: The graphs in each panel of the figure plot the QTE estimates of the effect of the bank account subsidy on the distribution of household total savings when there are four auxiliary regressors: baseline value of total savings, household size, age, and married female dummy. The shadowed areas display 95% confidence regions.
The results in Tables 4-5 prompt two observations. First, consistent with the theoretical and simulation results, the standard errors for the regression-adjusted QTE estimates are mostly lower than those for the QTE estimate without adjustment. This observation holds for most specifications and estimation methods of the auxiliary regressions.131313The efficiency gain from the "NP" adjustment is not the only reason for its small standard error at the 25% QTE. Another reason is that the treated outcomes around this percentile themselves do not have much variation. For example, in Table 4, the standard error for the "LPML" QTE estimate at the 25th percentile is 16.7% less than that for the QTE estimate without adjustment. As another example, in Table 5, at the 25th percentile, the standard error for the "LPMLX" QTE estimate is 43.4% less than that for the QTE estimate without adjustment, and at the median, the standard error for the "LPML" QTE estimate is 7% less than that for the QTE estimate without adjustment.
Second, there is substantial heterogeneity in the impact of the subsidy on total savings. In particular, we observe larger effects as the quantile index increases, which is consistent with the findings in Dupas et al. (2018). For example, Table 5 shows that, although the treatment effects are all positive and significantly different from zero at the 25%, 50%, and 75% quantiles, the magnitude of the effect increases by over 200% from the 25th percentile to the median and by around 100% from the median to the 75th percentile.
The second observation suggests that the heterogeneous effects of the subsidy on savings are economically sizable. To evaluate whether these effects are statistically significant, we report statistical tests for the heterogeneity of the QTEs in Table 6. Specifically, we test the null hypotheses that the QTE differences between the 50% and 25%, the 75% and 50%, and the 75% and 25% quantiles are each zero. Table 6 shows that only the difference between the 50th and 25th percentile QTEs is statistically significant at the 5% significance level.
How does the impact of the subsidy vary across the distribution of total savings? The QTEs on the distribution of savings are plotted in Figure 1, where the shaded areas represent the 95% confidence region. The figure shows that the QTEs are insignificantly different from zero below roughly the 20th percentile. Between roughly the 20th and 80th percentiles, treatment-group savings exceed control-group savings by a growing margin, yielding increasingly significant positive QTEs. Beyond the 80th percentile, the QTEs again become insignificantly different from zero. These findings point to notable distributional heterogeneity in the impact of the subsidy on savings.
8 Conclusion
This paper proposes the use of auxiliary regressions to incorporate additional covariates into estimation and inference for unconditional QTEs under CARs. The auxiliary regression model may be estimated parametrically, nonparametrically, or via regularization when the covariates are high-dimensional. Both the estimation and bootstrap inference methods are robust to potential misspecification of the auxiliary model and do not suffer from the conservatism caused by CARs. Including extra covariates can improve efficiency, and when the auxiliary regression is correctly specified, the regression-adjusted estimator achieves the minimum asymptotic variance. In both the simulations and the empirical application, the proposed regression-adjusted QTE estimator performs well. These results and the robustness of the methods to auxiliary model misspecification reflect the aphorism widespread in scientific modeling that all models may be wrong, but some are useful.141414The aphorism "all models are wrong, but some are useful" is often attributed to the statistician George Box (1976). But the notion has many antecedents, including a particularly apposite remark made in 1947 by John von Neumann (2019) in an essay on the empirical origins of mathematical ideas to the effect that "truth … is much too complicated to allow anything but approximations".
Acknowledgements
We thank the Managing Editor, Elie Tamer, the Associate Editor and three anonymous referees for many useful comments that helped to improve this paper. We are also grateful to Michael Qingliang Fan and seminar participants from the 2022 Econometric Society Australasian Meeting, the 2021 Nanyang Econometrics Workshop, University of California, Irvine, and University of Sydney for their comments.
Funding:
Yichong Zhang acknowledges financial support from the Singapore Ministry of Education under Tier 2 grant No. MOE2018-T2-2-169, the NSFC under the grant No. 72133002, and a Lee Kong Chian fellowship. Peter C. B. Phillips acknowledges support from NSF Grant No. SES 18-50860, a Kelly Fellowship at the University of Auckland, and a Lee Kong Chian Fellowship. Yubo Tao acknowledges the financial support from the Start-up Research Grant of University of Macau (SRG2022-00016-FSS). Liang Jiang acknowledges support from MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Project No.18YJC790063).
Appendix A Regularization Method for Regression Adjustments
This section considers estimation of the auxiliary regression in a high-dimensional environment, in which the dimension of the regressors may exceed the sample size. When the number of raw controls is comparable to or exceeds the sample size, we can simply take the raw controls as the regressors. Alternatively, the regressors may be composed of a large dictionary of sieve bases derived from a fixed-dimensional vector through suitable transformations such as powers and interactions; high dimensionality in the regressors can thus arise from the desire to flexibly approximate nuisance functions. In our approach we follow Belloni et al. (2017) and implement a logistic regression with $\ell_1$-penalization, viewing the auxiliary model as a function of the high-dimensional regressors. We estimate as , where is defined in Assumption 8,
(A.1)
is a tuning parameter, and is a diagonal matrix of data-dependent penalty loadings. We specify and in Section B. Post-Lasso estimation is also considered. Let be the support of , where is the th coordinate of . We can complement with additional variables in that researchers want to control for and define the enlarged set of variables as . We compute the post-Lasso estimator as
(A.2)
Finally, we compute the auxiliary model as
(A.3)
We refer to the QTE estimator with the regularized adjustment as the HD estimator. Note that we use the estimator of in (A.3), where satisfies Assumption 8. All the analysis in this section takes account of the fact that the estimator, rather than the true quantity, is used.
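For illustration, the following is a minimal Python sketch of the Lasso-then-post-Lasso logic described above, assuming scikit-learn is available. The scalar inverse penalty C stands in for the penalty level and the diagonal matrix of data-dependent loadings in (A.1), so this is a simplified stand-in rather than the exact estimator; the function name and defaults are ours.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def lasso_then_post_lasso(X, D, C=0.1):
    """Sketch of (A.1)-(A.2): an L1-penalized logistic regression of the
    binary indicator D on high-dimensional controls X, followed by an
    (effectively) unpenalized logistic refit on the selected support.
    The scalar C replaces the penalty level and loadings of (A.1)."""
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    lasso.fit(X, D)
    support = np.flatnonzero(lasso.coef_.ravel())  # selected coordinates
    if support.size == 0:
        return lasso, support                      # nothing selected; keep the Lasso fit
    # Post-Lasso refit: a huge C makes the L2 penalty negligible.
    post = LogisticRegression(penalty="l2", C=1e8)
    post.fit(X[:, support], D)
    return post, support
```

In practice, the selected support can also be augmented with additional variables that researchers want to control for, as in the definition of the enlarged variable set above.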
Assumption 13.
(i) Let for . Suppose such that .
(ii) Suppose and for .
(iii) Suppose and .
(iv) , , , where denotes the number of elements in .
(v) There exists a constant such that .
(vi) Let be a sequence that diverges to infinity. Then, there exist two constants and such that, with probability approaching one, and , where denotes the number of nonzero components in .
(vii) For , let , where is the standard normal CDF and is a constant.
Assumption 13 is standard in the literature, and we refer interested readers to Belloni et al. (2017) for further discussion. Assumption 13(i) implies that the logistic model is approximately correctly specified. As the approximation is assumed to be sparse, the condition is not innocuous in the high-dimensional setting. Because our method is valid even when the auxiliary model is misspecified, we conjecture that Assumption 13(i) can be relaxed, which links to the recent literature on regularized estimation in the high-dimensional setting under misspecification; see, for example, Bradic et al. (2019) and Tan (2020) and the references therein. An interesting topic for future work is to study misspecification-robust high-dimensional estimators of the conditional probability model and their use in adjusting the QTE estimator under CARs based on (3.1) and (3.2). The following theorem shows that all the estimation and inference results in Theorems 3 and 5 hold for the HD estimator.
Theorem A.1.
Denote and as the th QTE estimator and its multiplier bootstrap counterpart defined in Sections 3 and 4, respectively, with and defined in (A.3). Further suppose Assumptions 1, 2, 4, 8, and 13 hold. Then, Assumptions 3 and 5 hold, which further imply Theorems 3 and 5 hold for and , respectively. In addition, for any finite-dimensional quantile indices , the covariance matrix of achieves the minimum (in the matrix sense) as characterized in Theorem 3.
Appendix B Practical Guidance and Computation
B.1 Procedures for estimation and bootstrap inference
We can compute by solving the subgradient conditions of (3.1) and (3.2), respectively. Specifically, we have such that , ,
(B.1)
and
(B.2)
We note that these estimators are uniquely defined as long as all the inequalities in (B.1) and (B.2) are strict, which is usually the case. If we have
then both and satisfy (B.1), where is the index such that and is the smallest observation in the treatment group that is larger than . In this case, we let .151515In this case, any value that belongs to can be viewed as a solution. Because , all choices are asymptotically equivalent. Similarly, by solving the subgradient conditions of (B.3) and (B.4), we have such that , ,
(B.3)
and
(B.4)
The inequalities in (B.3) and (B.4) are strict with probability one if is continuously distributed. In this case, are uniquely defined with probability one.
We summarize the steps in the bootstrap procedure as follows.
1.
2. Compute for and using and .
3.
4.
5. Repeat the above step for and obtain bootstrap estimates of the QTE, denoted as . A schematic implementation of the loop is sketched below.
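As a schematic illustration of this loop, the following Python sketch computes multiplier bootstrap draws of an unadjusted QTE. The regression-adjustment terms of (B.3) and (B.4) are omitted for brevity, and the standard-exponential multiplier weight is one possible choice; this is therefore a simplified sketch, not the paper's exact procedure.

```python
import numpy as np

def weighted_quantile(y, w, tau):
    """tau-th quantile of y under non-negative weights w, read off the
    weighted empirical CDF (the minimizer of the weighted check loss)."""
    order = np.argsort(y)
    cdf = np.cumsum(w[order]) / np.sum(w)
    return y[order][np.searchsorted(cdf, tau)]

def multiplier_bootstrap_qte(y, d, tau, n_boot=1000, seed=0):
    """Schematic multiplier bootstrap draws of the QTE at quantile tau:
    reweight every unit by an i.i.d. multiplier and recompute the
    treated-minus-control quantile difference."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        xi = rng.exponential(size=len(y))  # one common multiplier choice
        draws[b] = (weighted_quantile(y[d == 1], xi[d == 1], tau)
                    - weighted_quantile(y[d == 0], xi[d == 0], tau))
    return draws
```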
B.2 Bootstrap confidence intervals
Given the bootstrap estimates, we next discuss how to conduct bootstrap inference for the null hypotheses with single, multiple, and a continuum of quantile indices.
Case (1). We test the single null hypothesis that vs. . Set in the procedures described above and let and be the th empirical quantile of the sequence and the th standard normal critical value, respectively. Let be the significance level. We suggest using the bootstrap estimator to construct the standard error of as . Then the valid confidence interval and Wald test using this standard error are
and , respectively.161616It is asymptotically valid to use standard and percentile bootstrap confidence intervals. But in simulations we found that the confidence interval proposed in the paper has better finite sample performance in terms of coverage rates under the null.
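To make the construction concrete, the sketch below computes a bootstrap standard error and the implied Wald confidence interval for a single quantile index. The normalized spread between upper and lower bootstrap quantiles is one standard construction of this type, used here purely for illustration.

```python
import numpy as np
from scipy.stats import norm

def wald_ci(q_hat, boot_draws, alpha=0.05):
    """Bootstrap standard error from the normalized spread between the
    upper and lower bootstrap quantiles (one standard construction), and
    the Wald interval q_hat +/- z_{1-alpha/2} * se."""
    z = norm.ppf(1 - alpha / 2)
    lo, hi = np.quantile(boot_draws, [alpha / 2, 1 - alpha / 2])
    se = (hi - lo) / (2 * z)
    return se, (q_hat - z * se, q_hat + z * se)
```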
Case (2). We test the null hypothesis that vs. . In this case, we have in the procedure described in Section B.1. Further, let be the th empirical quantile of the sequence , and let be the significance level. We suggest using the bootstrap standard error to construct the valid confidence interval and Wald test as
and , respectively, where .
Case (3). We test the null hypothesis that
In theory, we should let . In practice, we let be a fine grid over , with the number of grid points as large as is computationally feasible. Further, let denote the th empirical quantile of the sequence for . Compute the standard error of as
The uniform confidence band at significance level is constructed as
where the critical value is computed as
and is first-order equivalent to in the sense that . We suggest choosing over other choices such as due to its better finite sample performance. We reject at significance level if
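The following Python sketch assembles a sup-t uniform band from bootstrap draws on a grid of quantile indices, using the same interquantile-spread standard error as in the sketch for Case (1). Studentizing the sup statistic and taking its bootstrap quantile is a standard implementation of this type of band; it is shown only as an illustration.

```python
import numpy as np
from scipy.stats import norm

def uniform_band(q_hat, boot, alpha=0.05):
    """Sup-t uniform confidence band. q_hat: (G,) estimates on the grid of
    quantile indices; boot: (B, G) bootstrap draws. Pointwise standard
    errors come from the normalized 2.5%-97.5% bootstrap spread; the
    critical value is the (1 - alpha) quantile of the studentized sup."""
    z975 = norm.ppf(0.975)
    lo, hi = np.quantile(boot, [0.025, 0.975], axis=0)
    se = np.maximum((hi - lo) / (2 * z975), 1e-12)        # (G,)
    sup_stat = np.max(np.abs(boot - q_hat) / se, axis=1)  # (B,)
    crit = np.quantile(sup_stat, 1 - alpha)
    return q_hat - crit * se, q_hat + crit * se
```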
B.3 Computation of Auxiliary Regressions
Parametric regressions.
For the linear probability model, we compute the LP estimator via (5.7). For the logistic model, we consider the ML and the LPML estimators. First, we compute the ML estimator as in (5.9), which is the quasi maximum likelihood estimator of a flexible distribution regression. Second, we propose to compute the logistic function values with and treat them as regressors in a linear adjustment to further improve the ML estimate.
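A minimal Python sketch of the LPML idea follows, assuming scikit-learn: fit the logistic model, then use its fitted values as the regressor of a linear least squares adjustment. The variable y_ind stands for the indicator that the outcome lies below the candidate quantile; in the paper's construction the regressions are run within treatment arms and strata as defined by (5.7) and (5.9), so the pooled version here is a simplification.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def lpml_adjustment(X, y_ind):
    """Fit a logistic model for the below-quantile indicator y_ind given X,
    then treat the fitted logistic probabilities as the single regressor of
    a least squares adjustment (the 'further improved' LPML step)."""
    ml = LogisticRegression(C=1e8)        # huge C approximates unpenalized MLE
    ml.fit(X, y_ind)
    fitted = ml.predict_proba(X)[:, 1].reshape(-1, 1)  # logistic function values
    lp = LinearRegression().fit(fitted, y_ind)
    return lp.predict(fitted)             # adjusted conditional probabilities
```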
Sieve logistic regressions.
We provide more detail on the sieve basis. Recall , where are basis functions of a linear sieve space, denoted as . Given that all elements of are continuously distributed, the sieve space can be constructed as follows.
1. For each element of , , let be the univariate sieve space of dimension . One example of is the linear span of the -dimensional polynomials given by ; another is the linear span of -order splines with nodes given by , where the grid partitions into subsets , , , and .
2. Let be the tensor product of the univariate sieve spaces, defined as the linear space spanned by the products of the univariate basis functions , where . The dimension of the tensor space is then the product of the univariate dimensions; a sketch of this construction is given below.
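For concreteness, here is a minimal Python sketch of the tensor-product construction in step 2, assuming the polynomial (rather than spline) univariate bases and a common degree across coordinates:

```python
import numpy as np
from itertools import product

def tensor_polynomial_basis(X, degree):
    """Tensor-product polynomial sieve basis: every product
    x_1**a_1 * ... * x_d**a_d with 0 <= a_j <= degree, giving
    (degree + 1) ** d basis functions, the tensor-space dimension."""
    n, d = X.shape
    exponents = product(range(degree + 1), repeat=d)
    return np.column_stack(
        [np.prod(X ** np.asarray(a, dtype=float), axis=1) for a in exponents])
```

For example, with d = 4 covariates and degree 2, the basis already has 3^4 = 81 terms, which illustrates how quickly flexible sieve dictionaries become high-dimensional.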
Logistic regressions with an $\ell_1$ penalization.
We follow the estimation procedure and the choice of tuning parameter proposed by Belloni et al. (2017). We provide details below for completeness. Recall . We set following Belloni et al. (2017). We then implement the following algorithm to estimate for :
(i) Let for , where . Estimate .
(ii) For , obtain , where . Estimate .
(iii) Let .
(iv) Repeat the above procedure for . A schematic implementation is sketched below.
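A schematic Python rendering of the iteration follows, assuming scikit-learn. The residual-based loading formula, the constants, and the mapping of the penalty level into scikit-learn's scalar C follow the spirit of Belloni et al. (2017) but are illustrative approximations, not the exact procedure.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def iterated_l1_logit(X, D, n_iter=2, c=1.1):
    """Iterated penalty loadings, schematically: rescale column j of X by
    1 / l_j with l_j = sqrt(mean(x_j^2 * resid^2)), fit an L1-penalized
    logit, refresh the residuals from the current fit, and repeat."""
    n, p = X.shape
    gamma = 0.1 / np.log(n)                       # illustrative choice
    lam = c * np.sqrt(n) * norm.ppf(1 - gamma / (2 * p))
    resid = D - D.mean()                          # crude initial residuals
    for _ in range(n_iter):
        loads = np.sqrt(np.mean(X ** 2 * resid[:, None] ** 2, axis=0))
        loads = np.maximum(loads, 1e-12)          # guard against zero loadings
        Xs = X / loads                            # absorb loadings into the design
        fit = LogisticRegression(penalty="l1", solver="liblinear", C=1.0 / lam)
        fit.fit(Xs, D)
        resid = D - fit.predict_proba(Xs)[:, 1]
    return fit, loads
```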
Appendix C Additional Simulation Results
C.1 Pointwise tests
Additional simulation results are provided for the pointwise tests at the 25% and 75% quantiles. The results are summarized in Tables 7 and 8. The simulation settings are the same as those for the pointwise tests in Section 6 of the main paper.
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.053 | 0.057 | 0.052 | 0.052 | 0.056 | 0.054 | 0.053 | 0.052 | 0.334 | 0.341 | 0.346 | 0.344 | 0.579 | 0.591 | 0.595 | 0.607 |
LP | 0.054 | 0.056 | 0.052 | 0.054 | 0.053 | 0.057 | 0.053 | 0.052 | 0.391 | 0.406 | 0.405 | 0.394 | 0.683 | 0.696 | 0.693 | 0.694 |
ML | 0.057 | 0.055 | 0.055 | 0.054 | 0.053 | 0.055 | 0.049 | 0.051 | 0.380 | 0.387 | 0.389 | 0.378 | 0.673 | 0.678 | 0.683 | 0.674 |
LPML | 0.056 | 0.056 | 0.057 | 0.057 | 0.052 | 0.056 | 0.055 | 0.056 | 0.410 | 0.418 | 0.417 | 0.409 | 0.714 | 0.722 | 0.717 | 0.715 |
MLX | 0.058 | 0.060 | 0.057 | 0.057 | 0.051 | 0.055 | 0.053 | 0.057 | 0.387 | 0.394 | 0.398 | 0.388 | 0.668 | 0.674 | 0.677 | 0.677 |
LPMLX | 0.060 | 0.059 | 0.060 | 0.060 | 0.052 | 0.060 | 0.056 | 0.057 | 0.414 | 0.423 | 0.431 | 0.415 | 0.718 | 0.722 | 0.725 | 0.718 |
NP | 0.061 | 0.065 | 0.063 | 0.063 | 0.056 | 0.060 | 0.058 | 0.057 | 0.432 | 0.444 | 0.442 | 0.427 | 0.724 | 0.728 | 0.730 | 0.724 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.044 | 0.047 | 0.048 | 0.046 | 0.054 | 0.049 | 0.044 | 0.046 | 0.457 | 0.457 | 0.466 | 0.475 | 0.741 | 0.751 | 0.752 | 0.760 |
LP | 0.050 | 0.050 | 0.050 | 0.049 | 0.054 | 0.052 | 0.045 | 0.044 | 0.541 | 0.542 | 0.545 | 0.538 | 0.824 | 0.830 | 0.831 | 0.825 |
ML | 0.049 | 0.049 | 0.052 | 0.054 | 0.050 | 0.051 | 0.049 | 0.045 | 0.478 | 0.477 | 0.480 | 0.474 | 0.757 | 0.761 | 0.760 | 0.761 |
LPML | 0.053 | 0.052 | 0.056 | 0.054 | 0.053 | 0.050 | 0.048 | 0.044 | 0.542 | 0.540 | 0.544 | 0.536 | 0.832 | 0.837 | 0.840 | 0.833 |
MLX | 0.055 | 0.057 | 0.057 | 0.058 | 0.054 | 0.053 | 0.048 | 0.045 | 0.507 | 0.504 | 0.505 | 0.500 | 0.765 | 0.771 | 0.771 | 0.770 |
LPMLX | 0.055 | 0.061 | 0.061 | 0.057 | 0.057 | 0.054 | 0.050 | 0.046 | 0.572 | 0.567 | 0.572 | 0.563 | 0.848 | 0.850 | 0.852 | 0.845 |
NP | 0.063 | 0.065 | 0.063 | 0.061 | 0.058 | 0.056 | 0.052 | 0.051 | 0.575 | 0.571 | 0.576 | 0.572 | 0.847 | 0.852 | 0.854 | 0.847 |
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.054 | 0.055 | 0.056 | 0.054 | 0.057 | 0.052 | 0.051 | 0.051 | 0.348 | 0.342 | 0.352 | 0.338 | 0.583 | 0.594 | 0.601 | 0.576 |
LP | 0.055 | 0.052 | 0.056 | 0.052 | 0.050 | 0.053 | 0.053 | 0.049 | 0.428 | 0.418 | 0.424 | 0.432 | 0.686 | 0.698 | 0.698 | 0.697 |
ML | 0.051 | 0.053 | 0.054 | 0.051 | 0.050 | 0.051 | 0.054 | 0.054 | 0.402 | 0.395 | 0.403 | 0.402 | 0.658 | 0.667 | 0.666 | 0.667 |
LPML | 0.056 | 0.056 | 0.058 | 0.056 | 0.050 | 0.053 | 0.056 | 0.051 | 0.428 | 0.425 | 0.437 | 0.435 | 0.706 | 0.711 | 0.711 | 0.709 |
MLX | 0.059 | 0.056 | 0.057 | 0.057 | 0.054 | 0.054 | 0.053 | 0.055 | 0.404 | 0.399 | 0.404 | 0.406 | 0.654 | 0.663 | 0.662 | 0.657 |
LPMLX | 0.058 | 0.058 | 0.061 | 0.057 | 0.052 | 0.056 | 0.056 | 0.053 | 0.425 | 0.421 | 0.432 | 0.432 | 0.702 | 0.711 | 0.710 | 0.710 |
NP | 0.061 | 0.061 | 0.065 | 0.062 | 0.056 | 0.057 | 0.058 | 0.057 | 0.441 | 0.435 | 0.441 | 0.447 | 0.706 | 0.710 | 0.711 | 0.711 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.052 | 0.054 | 0.053 | 0.050 | 0.047 | 0.047 | 0.050 | 0.051 | 0.325 | 0.331 | 0.325 | 0.311 | 0.557 | 0.550 | 0.546 | 0.538 |
LP | 0.055 | 0.055 | 0.062 | 0.053 | 0.051 | 0.053 | 0.053 | 0.054 | 0.389 | 0.402 | 0.396 | 0.394 | 0.626 | 0.626 | 0.634 | 0.634 |
ML | 0.056 | 0.055 | 0.057 | 0.055 | 0.049 | 0.050 | 0.052 | 0.054 | 0.350 | 0.357 | 0.346 | 0.348 | 0.563 | 0.575 | 0.574 | 0.569 |
LPML | 0.054 | 0.057 | 0.060 | 0.055 | 0.053 | 0.051 | 0.055 | 0.053 | 0.390 | 0.400 | 0.394 | 0.388 | 0.635 | 0.640 | 0.639 | 0.644 |
MLX | 0.058 | 0.057 | 0.059 | 0.059 | 0.053 | 0.053 | 0.052 | 0.055 | 0.371 | 0.387 | 0.377 | 0.373 | 0.590 | 0.595 | 0.595 | 0.597 |
LPMLX | 0.062 | 0.060 | 0.065 | 0.058 | 0.055 | 0.055 | 0.057 | 0.056 | 0.416 | 0.425 | 0.420 | 0.421 | 0.658 | 0.663 | 0.663 | 0.668 |
NP | 0.068 | 0.068 | 0.067 | 0.063 | 0.056 | 0.054 | 0.058 | 0.059 | 0.429 | 0.436 | 0.427 | 0.426 | 0.663 | 0.670 | 0.670 | 0.672 |
C.2 Estimation biases and standard errors
In this section we report the biases and standard errors of our regression-adjusted estimators under three test settings. Specifically, the biases and standard errors for pointwise tests are summarized in Tables 9-11. Table 12 reports the biases and standard errors for estimating the difference of QTEs. Table 13 provides the average estimation bias and standard errors over the interval .
Bias | Standard Error | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.010 | 0.004 | -0.010 | -0.023 | 0.011 | -0.001 | -0.001 | -0.026 | 0.984 | 0.975 | 0.972 | 0.975 | 0.688 | 0.685 | 0.685 | 0.686 |
LP | 0.024 | 0.013 | 0.005 | 0.036 | 0.016 | 0.006 | 0.004 | 0.009 | 0.882 | 0.874 | 0.872 | 0.872 | 0.610 | 0.607 | 0.608 | 0.607 |
ML | 0.024 | 0.022 | 0.014 | 0.037 | 0.010 | 0.004 | -0.002 | 0.012 | 0.902 | 0.894 | 0.892 | 0.891 | 0.620 | 0.617 | 0.619 | 0.618 |
LPML | 0.011 | 0.010 | 0.002 | 0.028 | 0.005 | 0.000 | -0.009 | 0.006 | 0.867 | 0.863 | 0.857 | 0.860 | 0.596 | 0.592 | 0.595 | 0.592 |
MLX | -0.001 | -0.006 | -0.015 | 0.019 | 0.003 | -0.002 | -0.016 | 0.000 | 0.904 | 0.896 | 0.893 | 0.894 | 0.626 | 0.624 | 0.626 | 0.624 |
LPMLX | 0.005 | -0.002 | -0.012 | 0.015 | 0.000 | -0.007 | -0.014 | 0.000 | 0.867 | 0.861 | 0.857 | 0.858 | 0.594 | 0.591 | 0.593 | 0.592 |
NP | -0.037 | -0.045 | -0.053 | -0.021 | -0.014 | -0.018 | -0.027 | -0.013 | 0.869 | 0.862 | 0.858 | 0.859 | 0.592 | 0.590 | 0.592 | 0.591 |
Panel B: DGP (ii) | ||||||||||||||||
NA | -0.004 | 0.003 | -0.007 | -0.043 | -0.001 | -0.008 | -0.001 | -0.019 | 0.824 | 0.820 | 0.816 | 0.819 | 0.574 | 0.572 | 0.571 | 0.571 |
LP | -0.038 | -0.036 | -0.040 | -0.034 | -0.022 | -0.021 | -0.015 | -0.008 | 0.754 | 0.751 | 0.747 | 0.752 | 0.523 | 0.522 | 0.521 | 0.521 |
ML | -0.032 | -0.027 | -0.036 | -0.033 | -0.023 | -0.020 | -0.013 | -0.011 | 0.816 | 0.813 | 0.808 | 0.814 | 0.573 | 0.571 | 0.570 | 0.570 |
LPML | -0.013 | -0.007 | -0.013 | -0.003 | -0.008 | -0.009 | -0.002 | 0.004 | 0.741 | 0.739 | 0.734 | 0.740 | 0.513 | 0.511 | 0.511 | 0.511 |
MLX | -0.063 | -0.056 | -0.060 | -0.065 | -0.033 | -0.038 | -0.032 | -0.035 | 0.803 | 0.800 | 0.796 | 0.802 | 0.571 | 0.568 | 0.569 | 0.568 |
LPMLX | -0.061 | -0.054 | -0.057 | -0.050 | -0.032 | -0.035 | -0.028 | -0.022 | 0.735 | 0.733 | 0.729 | 0.734 | 0.510 | 0.509 | 0.508 | 0.508 |
NP | -0.068 | -0.066 | -0.072 | -0.062 | -0.039 | -0.041 | -0.033 | -0.026 | 0.734 | 0.732 | 0.729 | 0.733 | 0.510 | 0.509 | 0.508 | 0.508 |
Bias | Standard Error | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | -0.001 | 0.020 | 0.016 | 0.043 | -0.003 | -0.006 | -0.005 | 0.019 | 0.979 | 0.976 | 0.976 | 0.973 | 0.688 | 0.685 | 0.685 | 0.687 |
LP | -0.021 | -0.005 | -0.006 | -0.017 | -0.009 | -0.008 | -0.006 | -0.010 | 0.875 | 0.873 | 0.871 | 0.871 | 0.610 | 0.606 | 0.607 | 0.609 |
ML | 0.006 | 0.023 | 0.013 | 0.014 | 0.006 | 0.004 | 0.005 | 0.004 | 0.897 | 0.894 | 0.893 | 0.893 | 0.627 | 0.623 | 0.623 | 0.625 |
LPML | -0.004 | 0.013 | 0.001 | -0.002 | 0.001 | -0.001 | 0.003 | -0.004 | 0.860 | 0.858 | 0.856 | 0.856 | 0.597 | 0.592 | 0.592 | 0.595 |
MLX | -0.004 | 0.016 | 0.006 | 0.003 | 0.003 | -0.001 | 0.006 | 0.000 | 0.898 | 0.894 | 0.894 | 0.894 | 0.631 | 0.627 | 0.628 | 0.630 |
LPMLX | 0.008 | 0.025 | 0.011 | 0.005 | 0.010 | 0.006 | 0.008 | 0.003 | 0.860 | 0.858 | 0.855 | 0.855 | 0.594 | 0.592 | 0.591 | 0.593 |
NP | -0.021 | -0.006 | -0.014 | -0.028 | 0.003 | 0.002 | 0.004 | -0.003 | 0.859 | 0.858 | 0.855 | 0.855 | 0.593 | 0.589 | 0.589 | 0.592 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.032 | 0.017 | 0.032 | 0.091 | 0.003 | 0.013 | 0.019 | 0.039 | 1.036 | 1.026 | 1.023 | 1.025 | 0.723 | 0.720 | 0.718 | 0.718 |
LP | -0.012 | -0.034 | -0.020 | -0.015 | -0.012 | -0.003 | 0.001 | -0.008 | 0.944 | 0.937 | 0.932 | 0.936 | 0.660 | 0.658 | 0.656 | 0.655 |
ML | 0.025 | 0.010 | 0.026 | 0.029 | 0.010 | 0.009 | 0.015 | 0.010 | 1.006 | 1.000 | 0.997 | 0.998 | 0.709 | 0.705 | 0.702 | 0.702 |
LPML | 0.020 | 0.013 | 0.029 | 0.031 | 0.016 | 0.019 | 0.028 | 0.013 | 0.930 | 0.920 | 0.919 | 0.924 | 0.644 | 0.640 | 0.638 | 0.637 |
MLX | -0.008 | -0.038 | -0.021 | -0.001 | -0.015 | -0.012 | -0.008 | -0.008 | 0.981 | 0.972 | 0.969 | 0.970 | 0.692 | 0.690 | 0.688 | 0.688 |
LPMLX | -0.021 | -0.041 | -0.025 | -0.024 | -0.017 | -0.013 | -0.006 | -0.017 | 0.916 | 0.912 | 0.906 | 0.908 | 0.636 | 0.635 | 0.633 | 0.631 |
NP | -0.043 | -0.059 | -0.042 | -0.041 | -0.026 | -0.019 | -0.012 | -0.023 | 0.910 | 0.903 | 0.901 | 0.903 | 0.634 | 0.632 | 0.631 | 0.630 |
Bias | Standard Error | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.011 | 0.010 | 0.001 | 0.003 | 0.004 | 0.004 | -0.004 | -0.008 | 0.885 | 0.882 | 0.883 | 0.883 | 0.622 | 0.619 | 0.618 | 0.619 |
LP | 0.002 | 0.002 | -0.006 | 0.011 | 0.005 | 0.003 | 0.000 | -0.008 | 0.779 | 0.776 | 0.775 | 0.776 | 0.545 | 0.542 | 0.541 | 0.542 |
ML | 0.012 | 0.012 | 0.000 | 0.018 | 0.009 | 0.007 | 0.002 | -0.006 | 0.795 | 0.791 | 0.790 | 0.793 | 0.557 | 0.554 | 0.553 | 0.553 |
LPML | 0.004 | 0.006 | -0.007 | 0.009 | 0.004 | 0.005 | 0.002 | -0.004 | 0.759 | 0.756 | 0.755 | 0.757 | 0.528 | 0.525 | 0.524 | 0.525 |
MLX | 0.000 | 0.002 | -0.006 | 0.002 | 0.006 | 0.001 | 0.003 | -0.004 | 0.796 | 0.791 | 0.790 | 0.793 | 0.561 | 0.559 | 0.558 | 0.558 |
LPMLX | 0.004 | 0.007 | -0.007 | 0.007 | 0.006 | 0.006 | 0.001 | -0.004 | 0.758 | 0.755 | 0.753 | 0.756 | 0.527 | 0.524 | 0.523 | 0.524 |
NP | -0.023 | -0.017 | -0.028 | -0.015 | 0.004 | 0.004 | 0.000 | -0.005 | 0.758 | 0.755 | 0.753 | 0.755 | 0.526 | 0.524 | 0.523 | 0.524 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.024 | 0.019 | 0.007 | 0.025 | 0.005 | 0.006 | 0.014 | 0.004 | 0.783 | 0.777 | 0.777 | 0.777 | 0.547 | 0.545 | 0.545 | 0.546 |
LP | -0.009 | -0.012 | -0.029 | -0.011 | -0.013 | -0.008 | -0.004 | -0.010 | 0.711 | 0.707 | 0.707 | 0.707 | 0.495 | 0.494 | 0.493 | 0.494 |
ML | 0.009 | 0.009 | -0.008 | 0.005 | -0.008 | 0.001 | 0.003 | 0.001 | 0.744 | 0.74 | 0.739 | 0.74 | 0.523 | 0.522 | 0.522 | 0.522 |
LPML | 0.026 | 0.025 | 0.013 | 0.021 | 0.011 | 0.018 | 0.020 | 0.017 | 0.697 | 0.696 | 0.696 | 0.693 | 0.481 | 0.48 | 0.479 | 0.482 |
MLX | -0.037 | -0.039 | -0.049 | -0.037 | -0.028 | -0.019 | -0.018 | -0.018 | 0.727 | 0.724 | 0.723 | 0.723 | 0.517 | 0.516 | 0.515 | 0.516 |
LPMLX | -0.045 | -0.043 | -0.058 | -0.046 | -0.027 | -0.019 | -0.015 | -0.019 | 0.686 | 0.685 | 0.684 | 0.684 | 0.479 | 0.476 | 0.476 | 0.479 |
NP | -0.056 | -0.061 | -0.074 | -0.061 | -0.036 | -0.027 | -0.023 | -0.027 | 0.685 | 0.681 | 0.682 | 0.681 | 0.474 | 0.473 | 0.473 | 0.473 |
Bias | Standard Error | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | -0.011 | 0.015 | 0.029 | 0.067 | -0.013 | -0.007 | 0.000 | 0.045 | 1.310 | 1.301 | 1.299 | 1.299 | 0.911 | 0.905 | 0.906 | 0.909 |
LP | -0.044 | -0.020 | -0.007 | -0.052 | -0.025 | -0.015 | -0.007 | -0.019 | 1.255 | 1.248 | 1.246 | 1.245 | 0.869 | 0.864 | 0.865 | 0.867 |
ML | -0.029 | -0.012 | -0.005 | -0.034 | -0.009 | -0.006 | 0.001 | -0.015 | 1.253 | 1.243 | 1.241 | 1.241 | 0.865 | 0.860 | 0.862 | 0.862 |
LPML | -0.028 | -0.008 | -0.010 | -0.043 | -0.010 | -0.007 | 0.009 | -0.015 | 1.201 | 1.194 | 1.190 | 1.191 | 0.823 | 0.819 | 0.820 | 0.822 |
MLX | -0.012 | 0.000 | 0.012 | -0.016 | 0.003 | 0.000 | 0.024 | -0.001 | 1.250 | 1.244 | 1.239 | 1.241 | 0.870 | 0.865 | 0.867 | 0.868 |
LPMLX | -0.019 | 0.006 | 0.006 | -0.029 | 0.007 | 0.011 | 0.025 | 0.002 | 1.197 | 1.190 | 1.187 | 1.187 | 0.822 | 0.817 | 0.818 | 0.819 |
NP | -0.010 | 0.014 | 0.018 | -0.026 | 0.009 | 0.011 | 0.028 | 0.002 | 1.198 | 1.192 | 1.188 | 1.189 | 0.819 | 0.814 | 0.816 | 0.818 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.038 | 0.016 | 0.040 | 0.135 | 0.007 | 0.022 | 0.021 | 0.060 | 1.280 | 1.270 | 1.264 | 1.268 | 0.886 | 0.882 | 0.881 | 0.880 |
LP | 0.029 | 0.002 | 0.021 | 0.020 | 0.012 | 0.018 | 0.017 | 0.000 | 1.201 | 1.192 | 1.186 | 1.191 | 0.831 | 0.829 | 0.827 | 0.826 |
ML | 0.053 | 0.040 | 0.048 | 0.061 | 0.034 | 0.030 | 0.029 | 0.026 | 1.286 | 1.279 | 1.270 | 1.276 | 0.898 | 0.894 | 0.892 | 0.893 |
LPML | 0.030 | 0.014 | 0.038 | 0.027 | 0.024 | 0.027 | 0.031 | 0.010 | 1.180 | 1.169 | 1.164 | 1.172 | 0.811 | 0.808 | 0.806 | 0.806 |
MLX | 0.054 | 0.032 | 0.043 | 0.054 | 0.022 | 0.024 | 0.012 | 0.016 | 1.258 | 1.253 | 1.247 | 1.252 | 0.889 | 0.884 | 0.884 | 0.882 |
LPMLX | 0.028 | 0.003 | 0.023 | 0.013 | 0.013 | 0.021 | 0.021 | 0.004 | 1.165 | 1.159 | 1.152 | 1.157 | 0.804 | 0.803 | 0.801 | 0.799 |
NP | 0.019 | 0.001 | 0.023 | 0.010 | 0.014 | 0.021 | 0.021 | 0.003 | 1.160 | 1.153 | 1.149 | 1.152 | 0.804 | 0.801 | 0.799 | 0.799 |
Bias | Standard Error | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.003 | 0.012 | 0.000 | 0.010 | 0.002 | 0.000 | -0.004 | -0.004 | 0.931 | 0.927 | 0.925 | 0.925 | 0.653 | 0.651 | 0.651 | 0.652 |
LP | 0.001 | 0.004 | -0.003 | 0.008 | 0.002 | 0.001 | -0.001 | -0.002 | 0.829 | 0.825 | 0.824 | 0.824 | 0.576 | 0.575 | 0.575 | 0.575 |
ML | 0.012 | 0.016 | 0.005 | 0.019 | 0.005 | 0.006 | 0.001 | 0.003 | 0.851 | 0.847 | 0.845 | 0.846 | 0.593 | 0.591 | 0.591 | 0.591 |
LPML | 0.002 | 0.008 | -0.003 | 0.008 | 0.002 | 0.001 | -0.002 | -0.002 | 0.814 | 0.811 | 0.808 | 0.810 | 0.562 | 0.560 | 0.560 | 0.560 |
MLX | -0.002 | 0.006 | -0.008 | 0.009 | 0.000 | 0.001 | -0.002 | -0.003 | 0.850 | 0.847 | 0.845 | 0.846 | 0.596 | 0.594 | 0.594 | 0.594 |
LPMLX | 0.003 | 0.010 | -0.004 | 0.007 | 0.003 | 0.001 | -0.001 | -0.002 | 0.813 | 0.810 | 0.808 | 0.808 | 0.561 | 0.559 | 0.559 | 0.560 |
NP | -0.024 | -0.017 | -0.029 | -0.019 | -0.002 | -0.003 | -0.005 | -0.006 | 0.813 | 0.810 | 0.808 | 0.808 | 0.560 | 0.558 | 0.558 | 0.558 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.012 | 0.013 | 0.010 | 0.024 | 0.002 | 0.003 | 0.010 | 0.005 | 0.850 | 0.843 | 0.841 | 0.843 | 0.593 | 0.591 | 0.590 | 0.590 |
LP | -0.024 | -0.022 | -0.027 | -0.017 | -0.015 | -0.011 | -0.007 | -0.010 | 0.772 | 0.767 | 0.765 | 0.766 | 0.537 | 0.535 | 0.534 | 0.534 |
ML | 0.001 | 0.000 | 0.000 | 0.009 | -0.003 | -0.001 | 0.005 | 0.001 | 0.820 | 0.816 | 0.814 | 0.816 | 0.579 | 0.577 | 0.575 | 0.575 |
LPML | 0.015 | 0.016 | 0.015 | 0.023 | 0.010 | 0.014 | 0.018 | 0.014 | 0.761 | 0.757 | 0.755 | 0.757 | 0.525 | 0.523 | 0.522 | 0.522 |
MLX | -0.035 | -0.038 | -0.038 | -0.030 | -0.022 | -0.024 | -0.019 | -0.020 | 0.801 | 0.797 | 0.795 | 0.796 | 0.569 | 0.567 | 0.566 | 0.566 |
LPMLX | -0.039 | -0.040 | -0.039 | -0.034 | -0.021 | -0.020 | -0.015 | -0.018 | 0.752 | 0.748 | 0.745 | 0.746 | 0.521 | 0.519 | 0.518 | 0.519 |
NP | -0.055 | -0.059 | -0.057 | -0.052 | -0.033 | -0.030 | -0.024 | -0.028 | 0.746 | 0.742 | 0.741 | 0.742 | 0.516 | 0.515 | 0.514 | 0.514 |
C.3 Naïve Bootstrap Inference
In this section we report the size and power of our regression-adjusted estimator for the median QTE when the estimated propensity score is replaced by the true propensity score. We then consider the multiplier bootstrap as defined in the main text, again with the estimated propensity score replaced by the true one. We call this the naïve bootstrap inference because the simulation results below show that it is conservative. Specifically, we report additional simulation results for the pointwise tests (Tables 14-16), tests for differences (Table 17), and uniform tests (Table 18).
Comparing these results with those in Section 6, we see that with the true, instead of the estimated, propensity score, the multiplier bootstrap inference becomes conservative under the randomization schemes "WEI", "BCD", and "SBR". Specifically, the sizes are much smaller than the nominal rate of 5%. At the same time, the powers are smaller than their counterparts in Section 6. The improvement in power of the "LPMLX" estimator with the estimated propensity score over the "LPMLX" estimator with the true propensity score is due to the 31-38% reduction in the standard errors. This outcome is consistent with the findings in Bugni et al. (2018) and Zhang and Zheng (2020) that naïve inference methods under CARs are conservative.
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.049 | 0.032 | 0.022 | 0.023 | 0.056 | 0.032 | 0.023 | 0.023 | 0.255 | 0.234 | 0.224 | 0.220 | 0.454 | 0.454 | 0.456 | 0.465 |
LP | 0.048 | 0.018 | 0.008 | 0.006 | 0.053 | 0.019 | 0.006 | 0.007 | 0.222 | 0.173 | 0.149 | 0.113 | 0.399 | 0.391 | 0.372 | 0.331 |
ML | 0.048 | 0.034 | 0.032 | 0.031 | 0.048 | 0.044 | 0.037 | 0.037 | 0.330 | 0.292 | 0.279 | 0.253 | 0.622 | 0.614 | 0.602 | 0.592 |
LPML | 0.051 | 0.034 | 0.031 | 0.029 | 0.050 | 0.041 | 0.038 | 0.038 | 0.346 | 0.311 | 0.296 | 0.270 | 0.643 | 0.630 | 0.622 | 0.606 |
MLX | 0.049 | 0.039 | 0.033 | 0.034 | 0.053 | 0.046 | 0.041 | 0.042 | 0.334 | 0.300 | 0.294 | 0.265 | 0.621 | 0.617 | 0.608 | 0.597 |
LPMLX | 0.052 | 0.040 | 0.033 | 0.032 | 0.049 | 0.046 | 0.042 | 0.041 | 0.353 | 0.326 | 0.308 | 0.280 | 0.656 | 0.647 | 0.640 | 0.629 |
NP | 0.054 | 0.045 | 0.037 | 0.035 | 0.053 | 0.049 | 0.045 | 0.044 | 0.370 | 0.347 | 0.327 | 0.302 | 0.679 | 0.672 | 0.663 | 0.650 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.050 | 0.019 | 0.008 | 0.009 | 0.053 | 0.019 | 0.007 | 0.008 | 0.276 | 0.247 | 0.225 | 0.237 | 0.498 | 0.498 | 0.501 | 0.520 |
LP | 0.052 | 0.013 | 0.002 | 0.002 | 0.053 | 0.011 | 0.001 | 0.002 | 0.238 | 0.178 | 0.148 | 0.109 | 0.432 | 0.413 | 0.389 | 0.363 |
ML | 0.046 | 0.041 | 0.041 | 0.038 | 0.045 | 0.039 | 0.037 | 0.033 | 0.443 | 0.443 | 0.446 | 0.435 | 0.720 | 0.726 | 0.726 | 0.724 |
LPML | 0.047 | 0.039 | 0.037 | 0.033 | 0.047 | 0.038 | 0.031 | 0.030 | 0.444 | 0.431 | 0.440 | 0.435 | 0.720 | 0.735 | 0.736 | 0.741 |
MLX | 0.052 | 0.046 | 0.044 | 0.045 | 0.048 | 0.044 | 0.041 | 0.038 | 0.469 | 0.470 | 0.472 | 0.464 | 0.732 | 0.744 | 0.743 | 0.748 |
LPMLX | 0.053 | 0.049 | 0.052 | 0.047 | 0.053 | 0.047 | 0.042 | 0.041 | 0.531 | 0.527 | 0.531 | 0.527 | 0.818 | 0.829 | 0.834 | 0.830 |
NP | 0.058 | 0.056 | 0.056 | 0.057 | 0.052 | 0.053 | 0.050 | 0.049 | 0.548 | 0.553 | 0.556 | 0.553 | 0.841 | 0.848 | 0.850 | 0.842 |
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.056 | 0.026 | 0.015 | 0.019 | 0.053 | 0.028 | 0.014 | 0.015 | 0.291 | 0.255 | 0.242 | 0.236 | 0.492 | 0.501 | 0.503 | 0.504 |
LP | 0.049 | 0.008 | 0.001 | 0.001 | 0.052 | 0.008 | 0.001 | 0.000 | 0.181 | 0.103 | 0.053 | 0.033 | 0.314 | 0.254 | 0.191 | 0.158 |
ML | 0.050 | 0.018 | 0.008 | 0.007 | 0.051 | 0.018 | 0.006 | 0.007 | 0.273 | 0.226 | 0.198 | 0.166 | 0.479 | 0.479 | 0.479 | 0.458 |
LPML | 0.049 | 0.017 | 0.006 | 0.006 | 0.048 | 0.017 | 0.006 | 0.005 | 0.276 | 0.217 | 0.198 | 0.155 | 0.495 | 0.492 | 0.491 | 0.476 |
MLX | 0.050 | 0.018 | 0.008 | 0.008 | 0.052 | 0.021 | 0.008 | 0.007 | 0.270 | 0.229 | 0.211 | 0.169 | 0.482 | 0.472 | 0.473 | 0.455 |
LPMLX | 0.053 | 0.017 | 0.007 | 0.007 | 0.050 | 0.018 | 0.006 | 0.006 | 0.284 | 0.232 | 0.209 | 0.169 | 0.500 | 0.498 | 0.498 | 0.479 |
NP | 0.055 | 0.020 | 0.008 | 0.008 | 0.052 | 0.021 | 0.006 | 0.007 | 0.291 | 0.243 | 0.227 | 0.184 | 0.507 | 0.507 | 0.507 | 0.490 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.051 | 0.017 | 0.005 | 0.005 | 0.048 | 0.016 | 0.006 | 0.006 | 0.284 | 0.244 | 0.223 | 0.211 | 0.499 | 0.491 | 0.486 | 0.494 |
LP | 0.047 | 0.006 | 0.000 | 0.000 | 0.048 | 0.004 | 0.000 | 0.000 | 0.171 | 0.087 | 0.041 | 0.019 | 0.315 | 0.229 | 0.150 | 0.116 |
ML | 0.050 | 0.021 | 0.013 | 0.011 | 0.047 | 0.026 | 0.019 | 0.018 | 0.315 | 0.254 | 0.216 | 0.187 | 0.588 | 0.552 | 0.528 | 0.509 |
LPML | 0.049 | 0.016 | 0.009 | 0.006 | 0.045 | 0.017 | 0.010 | 0.009 | 0.296 | 0.229 | 0.188 | 0.166 | 0.526 | 0.484 | 0.462 | 0.440 |
MLX | 0.056 | 0.021 | 0.014 | 0.013 | 0.052 | 0.027 | 0.019 | 0.021 | 0.336 | 0.275 | 0.243 | 0.213 | 0.606 | 0.564 | 0.554 | 0.535 |
LPMLX | 0.055 | 0.020 | 0.013 | 0.011 | 0.054 | 0.021 | 0.017 | 0.016 | 0.341 | 0.280 | 0.240 | 0.210 | 0.598 | 0.565 | 0.550 | 0.525 |
NP | 0.059 | 0.023 | 0.015 | 0.014 | 0.056 | 0.028 | 0.024 | 0.023 | 0.368 | 0.303 | 0.269 | 0.237 | 0.651 | 0.614 | 0.602 | 0.578 |
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.054 | 0.033 | 0.024 | 0.023 | 0.054 | 0.030 | 0.022 | 0.019 | 0.265 | 0.241 | 0.232 | 0.218 | 0.458 | 0.462 | 0.462 | 0.441 |
LP | 0.015 | 0.002 | 0.000 | 0.000 | 0.029 | 0.002 | 0.000 | 0.000 | 0.089 | 0.022 | 0.002 | 0.001 | 0.162 | 0.073 | 0.012 | 0.006 |
ML | 0.028 | 0.003 | 0.001 | 0.000 | 0.044 | 0.005 | 0.001 | 0.000 | 0.127 | 0.050 | 0.015 | 0.008 | 0.228 | 0.137 | 0.067 | 0.053 |
LPML | 0.028 | 0.003 | 0.000 | 0.000 | 0.045 | 0.005 | 0.000 | 0.000 | 0.126 | 0.046 | 0.013 | 0.008 | 0.232 | 0.138 | 0.063 | 0.045 |
MLX | 0.028 | 0.003 | 0.001 | 0.000 | 0.045 | 0.005 | 0.000 | 0.000 | 0.127 | 0.050 | 0.016 | 0.008 | 0.228 | 0.141 | 0.066 | 0.053 |
LPMLX | 0.028 | 0.003 | 0.000 | 0.000 | 0.045 | 0.005 | 0.000 | 0.000 | 0.127 | 0.049 | 0.014 | 0.009 | 0.232 | 0.140 | 0.064 | 0.047 |
NP | 0.030 | 0.003 | 0.000 | 0.000 | 0.044 | 0.006 | 0.000 | 0.001 | 0.134 | 0.052 | 0.017 | 0.009 | 0.234 | 0.145 | 0.069 | 0.053 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.051 | 0.027 | 0.019 | 0.016 | 0.050 | 0.028 | 0.016 | 0.019 | 0.239 | 0.210 | 0.192 | 0.169 | 0.409 | 0.392 | 0.384 | 0.371 |
LP | 0.009 | 0.001 | 0.000 | 0.000 | 0.024 | 0.002 | 0.000 | 0.000 | 0.067 | 0.018 | 0.002 | 0.000 | 0.144 | 0.056 | 0.007 | 0.002 |
ML | 0.021 | 0.003 | 0.000 | 0.000 | 0.040 | 0.004 | 0.001 | 0.000 | 0.103 | 0.040 | 0.011 | 0.005 | 0.191 | 0.102 | 0.041 | 0.031 |
LPML | 0.027 | 0.003 | 0.000 | 0.000 | 0.041 | 0.004 | 0.000 | 0.000 | 0.101 | 0.034 | 0.008 | 0.004 | 0.185 | 0.094 | 0.030 | 0.026 |
MLX | 0.023 | 0.003 | 0.000 | 0.000 | 0.040 | 0.005 | 0.000 | 0.000 | 0.107 | 0.044 | 0.011 | 0.005 | 0.195 | 0.107 | 0.044 | 0.035 |
LPMLX | 0.024 | 0.003 | 0.000 | 0.000 | 0.041 | 0.004 | 0.000 | 0.000 | 0.110 | 0.043 | 0.011 | 0.004 | 0.199 | 0.109 | 0.042 | 0.032 |
NP | 0.027 | 0.002 | 0.000 | 0.000 | 0.040 | 0.003 | 0.000 | 0.000 | 0.113 | 0.045 | 0.013 | 0.005 | 0.205 | 0.113 | 0.044 | 0.035 |
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.035 | 0.033 | 0.028 | 0.029 | 0.043 | 0.035 | 0.030 | 0.033 | 0.187 | 0.184 | 0.170 | 0.159 | 0.345 | 0.343 | 0.335 | 0.311 |
LP | 0.010 | 0.006 | 0.001 | 0.001 | 0.029 | 0.008 | 0.003 | 0.003 | 0.081 | 0.059 | 0.034 | 0.026 | 0.203 | 0.149 | 0.110 | 0.097 |
ML | 0.027 | 0.008 | 0.003 | 0.002 | 0.042 | 0.009 | 0.002 | 0.002 | 0.105 | 0.060 | 0.030 | 0.025 | 0.190 | 0.127 | 0.080 | 0.072 |
LPML | 0.027 | 0.009 | 0.002 | 0.002 | 0.043 | 0.009 | 0.002 | 0.003 | 0.105 | 0.055 | 0.025 | 0.023 | 0.195 | 0.128 | 0.079 | 0.068 |
MLX | 0.027 | 0.008 | 0.004 | 0.003 | 0.042 | 0.010 | 0.002 | 0.002 | 0.101 | 0.059 | 0.029 | 0.026 | 0.188 | 0.125 | 0.079 | 0.072 |
LPMLX | 0.028 | 0.009 | 0.003 | 0.003 | 0.042 | 0.009 | 0.002 | 0.003 | 0.108 | 0.058 | 0.028 | 0.025 | 0.197 | 0.128 | 0.079 | 0.070 |
NP | 0.028 | 0.009 | 0.002 | 0.003 | 0.044 | 0.010 | 0.002 | 0.003 | 0.110 | 0.057 | 0.027 | 0.025 | 0.198 | 0.130 | 0.078 | 0.070 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.037 | 0.026 | 0.020 | 0.021 | 0.040 | 0.028 | 0.023 | 0.026 | 0.167 | 0.165 | 0.152 | 0.122 | 0.330 | 0.318 | 0.306 | 0.294 |
LP | 0.005 | 0.003 | 0.001 | 0.001 | 0.024 | 0.006 | 0.001 | 0.000 | 0.053 | 0.038 | 0.018 | 0.009 | 0.174 | 0.106 | 0.062 | 0.050 |
ML | 0.022 | 0.004 | 0.002 | 0.001 | 0.035 | 0.005 | 0.002 | 0.001 | 0.081 | 0.035 | 0.014 | 0.006 | 0.162 | 0.086 | 0.045 | 0.033 |
LPML | 0.026 | 0.005 | 0.001 | 0.000 | 0.039 | 0.004 | 0.001 | 0.001 | 0.081 | 0.032 | 0.009 | 0.005 | 0.157 | 0.070 | 0.025 | 0.021 |
MLX | 0.023 | 0.005 | 0.001 | 0.001 | 0.034 | 0.007 | 0.001 | 0.001 | 0.082 | 0.038 | 0.013 | 0.008 | 0.165 | 0.092 | 0.043 | 0.038 |
LPMLX | 0.023 | 0.005 | 0.001 | 0.000 | 0.037 | 0.005 | 0.001 | 0.001 | 0.084 | 0.037 | 0.012 | 0.007 | 0.170 | 0.089 | 0.037 | 0.031 |
NP | 0.024 | 0.005 | 0.001 | 0.000 | 0.037 | 0.005 | 0.001 | 0.001 | 0.091 | 0.038 | 0.013 | 0.007 | 0.175 | 0.093 | 0.042 | 0.036 |
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: DGP (i) | ||||||||||||||||
NA | 0.044 | 0.020 | 0.012 | 0.011 | 0.048 | 0.025 | 0.014 | 0.011 | 0.187 | 0.184 | 0.170 | 0.159 | 0.566 | 0.569 | 0.567 | 0.562 |
LP | 0.027 | 0.004 | 0.001 | 0.001 | 0.040 | 0.005 | 0.000 | 0.001 | 0.081 | 0.059 | 0.034 | 0.026 | 0.347 | 0.295 | 0.234 | 0.195 |
ML | 0.031 | 0.011 | 0.007 | 0.006 | 0.043 | 0.016 | 0.009 | 0.010 | 0.105 | 0.060 | 0.030 | 0.025 | 0.630 | 0.588 | 0.562 | 0.541 |
LPML | 0.034 | 0.010 | 0.007 | 0.007 | 0.043 | 0.016 | 0.008 | 0.009 | 0.105 | 0.055 | 0.025 | 0.023 | 0.636 | 0.608 | 0.583 | 0.562 |
MLX | 0.032 | 0.012 | 0.008 | 0.007 | 0.046 | 0.018 | 0.009 | 0.011 | 0.101 | 0.059 | 0.029 | 0.026 | 0.629 | 0.591 | 0.560 | 0.548 |
LPMLX | 0.033 | 0.013 | 0.010 | 0.008 | 0.045 | 0.019 | 0.011 | 0.011 | 0.108 | 0.058 | 0.028 | 0.025 | 0.653 | 0.623 | 0.602 | 0.582 |
NP | 0.034 | 0.013 | 0.008 | 0.009 | 0.047 | 0.021 | 0.012 | 0.012 | 0.110 | 0.057 | 0.027 | 0.025 | 0.673 | 0.646 | 0.624 | 0.604 |
Panel B: DGP (ii) | ||||||||||||||||
NA | 0.043 | 0.013 | 0.005 | 0.003 | 0.050 | 0.011 | 0.003 | 0.004 | 0.308 | 0.251 | 0.213 | 0.208 | 0.562 | 0.570 | 0.566 | 0.572 |
LP | 0.029 | 0.002 | 0.000 | 0.000 | 0.035 | 0.002 | 0.000 | 0.000 | 0.171 | 0.082 | 0.041 | 0.025 | 0.358 | 0.282 | 0.207 | 0.177 |
ML | 0.035 | 0.012 | 0.015 | 0.011 | 0.039 | 0.019 | 0.016 | 0.015 | 0.454 | 0.401 | 0.387 | 0.368 | 0.826 | 0.800 | 0.789 | 0.787 |
LPML | 0.030 | 0.011 | 0.010 | 0.010 | 0.036 | 0.016 | 0.009 | 0.010 | 0.444 | 0.388 | 0.374 | 0.355 | 0.804 | 0.771 | 0.762 | 0.755 |
MLX | 0.037 | 0.017 | 0.019 | 0.014 | 0.040 | 0.022 | 0.017 | 0.019 | 0.500 | 0.450 | 0.439 | 0.418 | 0.853 | 0.830 | 0.826 | 0.819 |
LPMLX | 0.038 | 0.018 | 0.019 | 0.016 | 0.040 | 0.023 | 0.018 | 0.019 | 0.534 | 0.492 | 0.482 | 0.462 | 0.889 | 0.870 | 0.864 | 0.860 |
NP | 0.041 | 0.023 | 0.025 | 0.023 | 0.045 | 0.028 | 0.023 | 0.024 | 0.573 | 0.539 | 0.533 | 0.513 | 0.919 | 0.906 | 0.903 | 0.899 |
C.4 High-dimensional covariates
To assess the finite sample performance of the estimation and inference methods introduced in Section A, we consider the outcome equation
(C.1)
where for all cases while , , and are separately specified as follows.
Let follow the standardized Beta(2, 2) distribution, , and . Further suppose that contains twenty covariates , where with and the variance matrix is the Toeplitz matrix
Further define , with , and , where are jointly standard normal.
We consider the post-Lasso estimator as defined in (A.2) with and . The choice of tuning parameter and the estimation procedure are detailed in Section B.3. We assess the empirical size and power of the tests for and . All simulations are replicated 10,000 times, with a bootstrap sample size of 1,000. We compute the true QTEs and QTE differences by simulation with a sample size of 10,000 and 1,000 replications. To compute power, we perturb the true values by 1.5.
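To indicate how such a design can be generated, the sketch below draws jointly normal covariates with a Toeplitz variance matrix. The geometric decay Sigma[j, k] = rho ** |j - k| is a common Toeplitz specification assumed here purely for illustration.

```python
import numpy as np
from scipy.linalg import toeplitz

def simulate_covariates(n, p=20, rho=0.5, seed=0):
    """Draw n observations of p jointly normal covariates whose variance
    matrix is Toeplitz; Sigma[j, k] = rho ** |j - k| is one common choice."""
    rng = np.random.default_rng(seed)
    sigma = toeplitz(rho ** np.arange(p))
    return rng.multivariate_normal(np.zeros(p), sigma, size=n)
```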
In Table 19, we report the empirical size and power for all three testing scenarios in the high-dimensional setting. In particular, we compare the "NA" method with our post-Lasso estimator and the oracle estimator. Evidently, the sizes of all three methods approach the nominal level as the sample size increases. The post-Lasso method dominates "NA" in all tests with superior power performance. The improvement in power of the "Post-Lasso" estimator over "NA" (i.e., with no adjustments) is due to a 2.5% reduction, on average, in the standard error of the difference of the QTE estimators, as shown in Table 20. This result is consistent with the theory given in Theorem A.1. The powers of the "Post-Lasso" and "Oracle" estimators are similar, which also confirms that the "Post-Lasso" estimator achieves the minimum asymptotic variance.
Size | Power | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cases | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: NA | ||||||||||||||||
0.049 | 0.046 | 0.046 | 0.050 | 0.050 | 0.049 | 0.046 | 0.045 | 0.649 | 0.646 | 0.645 | 0.660 | 0.915 | 0.911 | 0.915 | 0.917 | |
0.047 | 0.044 | 0.045 | 0.043 | 0.042 | 0.044 | 0.044 | 0.046 | 0.732 | 0.732 | 0.726 | 0.736 | 0.955 | 0.960 | 0.960 | 0.960 | |
0.050 | 0.045 | 0.046 | 0.047 | 0.044 | 0.046 | 0.051 | 0.047 | 0.620 | 0.635 | 0.638 | 0.627 | 0.895 | 0.904 | 0.903 | 0.898 | |
Diff | 0.038 | 0.041 | 0.038 | 0.039 | 0.041 | 0.042 | 0.042 | 0.037 | 0.365 | 0.373 | 0.369 | 0.351 | 0.643 | 0.644 | 0.644 | 0.628 |
Uniform | 0.035 | 0.034 | 0.036 | 0.036 | 0.040 | 0.038 | 0.040 | 0.040 | 0.852 | 0.860 | 0.857 | 0.865 | 0.994 | 0.996 | 0.994 | 0.994 |
Panel B: Post-Lasso | ||||||||||||||||
0.060 | 0.055 | 0.058 | 0.058 | 0.054 | 0.054 | 0.048 | 0.052 | 0.661 | 0.655 | 0.659 | 0.656 | 0.923 | 0.916 | 0.924 | 0.918 | |
0.056 | 0.054 | 0.056 | 0.052 | 0.048 | 0.050 | 0.051 | 0.049 | 0.739 | 0.744 | 0.728 | 0.741 | 0.960 | 0.964 | 0.963 | 0.963 | |
0.059 | 0.055 | 0.055 | 0.056 | 0.052 | 0.050 | 0.056 | 0.055 | 0.627 | 0.644 | 0.648 | 0.643 | 0.902 | 0.911 | 0.907 | 0.904 | |
Diff | 0.052 | 0.050 | 0.048 | 0.050 | 0.048 | 0.046 | 0.046 | 0.043 | 0.377 | 0.380 | 0.381 | 0.373 | 0.657 | 0.665 | 0.655 | 0.659 |
Uniform | 0.051 | 0.051 | 0.048 | 0.046 | 0.049 | 0.046 | 0.048 | 0.049 | 0.872 | 0.881 | 0.877 | 0.882 | 0.996 | 0.998 | 0.996 | 0.996 |
Panel C: Oracle | ||||||||||||||||
0.048 | 0.043 | 0.045 | 0.048 | 0.048 | 0.047 | 0.040 | 0.046 | 0.668 | 0.663 | 0.660 | 0.660 | 0.925 | 0.921 | 0.929 | 0.922 | |
0.041 | 0.044 | 0.044 | 0.040 | 0.042 | 0.044 | 0.043 | 0.044 | 0.745 | 0.746 | 0.739 | 0.749 | 0.962 | 0.967 | 0.967 | 0.967 | |
0.049 | 0.044 | 0.046 | 0.045 | 0.046 | 0.046 | 0.050 | 0.047 | 0.640 | 0.652 | 0.648 | 0.649 | 0.907 | 0.916 | 0.914 | 0.911 | |
Diff | 0.052 | 0.050 | 0.048 | 0.050 | 0.041 | 0.041 | 0.042 | 0.038 | 0.387 | 0.390 | 0.392 | 0.385 | 0.661 | 0.663 | 0.656 | 0.655 |
Uniform | 0.041 | 0.044 | 0.044 | 0.040 | 0.040 | 0.038 | 0.040 | 0.041 | 0.873 | 0.881 | 0.883 | 0.883 | 0.997 | 0.998 | 0.997 | 0.997 |
Bias | Standard Error | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cases | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR | SRS | WEI | BCD | SBR |
Panel A: NA | ||||||||||||||||
-0.009 | -0.006 | -0.001 | -0.033 | -0.005 | 0.000 | -0.005 | -0.018 | 0.652 | 0.649 | 0.647 | 0.651 | 0.456 | 0.455 | 0.454 | 0.456 | |
0.004 | 0.005 | 0.018 | -0.004 | 0.004 | -0.002 | -0.002 | -0.001 | 0.588 | 0.584 | 0.583 | 0.583 | 0.408 | 0.407 | 0.407 | 0.407 | |
0.022 | 0.006 | 0.016 | 0.028 | 0.017 | 0.001 | 0.007 | 0.016 | 0.652 | 0.651 | 0.650 | 0.648 | 0.457 | 0.456 | 0.454 | 0.456 | |
Diff | 0.011 | 0.022 | 0.012 | 0.067 | 0.007 | 0.008 | 0.013 | 0.032 | 0.922 | 0.917 | 0.916 | 0.917 | 0.644 | 0.642 | 0.641 | 0.643 |
Uniform | 0.008 | 0.000 | -0.004 | -0.002 | 0.000 | -0.001 | 0.000 | 0.008 | 0.624 | 0.621 | 0.620 | 0.621 | 0.436 | 0.435 | 0.435 | 0.435 |
Panel B: Post-Lasso | ||||||||||||||||
0.004 | 0.005 | 0.005 | -0.004 | -0.002 | 0.004 | -0.003 | -0.002 | 0.639 | 0.636 | 0.633 | 0.637 | 0.446 | 0.445 | 0.445 | 0.446 | |
0.014 | 0.010 | 0.029 | 0.011 | 0.007 | 0.001 | 0.003 | 0.005 | 0.576 | 0.573 | 0.572 | 0.571 | 0.400 | 0.398 | 0.398 | 0.399 | |
0.039 | 0.021 | 0.026 | 0.026 | 0.023 | 0.008 | 0.014 | 0.015 | 0.639 | 0.638 | 0.636 | 0.635 | 0.447 | 0.446 | 0.445 | 0.446 | |
Diff | 0.013 | 0.025 | 0.008 | 0.039 | 0.010 | 0.007 | 0.018 | 0.012 | 0.907 | 0.901 | 0.900 | 0.902 | 0.632 | 0.630 | 0.630 | 0.631 |
Uniform | 0.020 | 0.011 | 0.006 | 0.010 | 0.004 | 0.005 | 0.006 | 0.014 | 0.611 | 0.609 | 0.608 | 0.608 | 0.427 | 0.426 | 0.426 | 0.426 |
Panel C: Oracle | ||||||||||||||||
-0.010 | -0.003 | -0.001 | -0.003 | -0.006 | -0.001 | -0.007 | -0.004 | 0.633 | 0.631 | 0.629 | 0.633 | 0.445 | 0.443 | 0.443 | 0.444 | |
0.001 | 0.003 | 0.022 | 0.006 | 0.002 | -0.003 | -0.002 | 0.004 | 0.572 | 0.569 | 0.568 | 0.568 | 0.397 | 0.396 | 0.396 | 0.396 | |
0.022 | 0.008 | 0.020 | 0.022 | 0.018 | 0.001 | 0.009 | 0.013 | 0.637 | 0.635 | 0.633 | 0.632 | 0.446 | 0.445 | 0.443 | 0.445 | |
Diff | 0.017 | 0.026 | 0.013 | 0.040 | 0.008 | 0.009 | 0.016 | 0.014 | 0.900 | 0.894 | 0.894 | 0.895 | 0.629 | 0.628 | 0.627 | 0.628 |
Uniform | 0.007 | 0.002 | -0.002 | 0.004 | -0.001 | 0.000 | 0.000 | 0.011 | 0.607 | 0.604 | 0.603 | 0.604 | 0.425 | 0.424 | 0.424 | 0.424 |
Appendix D Additional Notation
Throughout the supplement the collection denotes an i.i.d. sequence with marginal distribution equal to the conditional distribution of given . In addition, are independent across and with . We further denote $\mathcal{F}$ as a generic class of functions, which differs across contexts, with envelope $F$. We say $\mathcal{F}$ is of VC-type with coefficients $(A, v)$ if
$$\sup_{Q} N\bigl(\mathcal{F}, \|\cdot\|_{Q,2}, \varepsilon \|F\|_{Q,2}\bigr) \le \left(\frac{A}{\varepsilon}\right)^{v} \quad \text{for all } \varepsilon \in (0, 1],$$
where $N(\cdot)$ denotes the covering number and the supremum is taken over all finitely discrete probability measures $Q$.
Appendix E Proof of Theorem 3
Notation used in this proof:
- For , is the number of individuals with in stratum .
- For , is the number of individuals in stratum .
- For , .
- For and , is the regression-adjusted estimator of with a generic regression adjustment.
- For , , , and , is the true specification.
- For , , , and , is the model for specified by researchers.
- For , , and , .
- For , , and , .
- For , , and , .
- For , denotes the density of .
- For , denotes the imbalance in stratum .
- For , , , and , .
We first derive the linear expansion of . By Knight's identity (Knight, 1998), we have
where
and
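For reference, the display invoked here is the standard expansion device for quantile objectives; in its usual form, Knight's identity for the check function reads:

```latex
% Knight's identity, with \rho_\tau(u) = u(\tau - \mathbf{1}\{u < 0\}):
\rho_\tau(u - v) - \rho_\tau(u)
  = -v\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
    + \int_0^v \bigl(\mathbf{1}\{u \le s\} - \mathbf{1}\{u \le 0\}\bigr)\,\mathrm{d}s .
```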
By change of variables, we have
Note that is exactly the same as that considered in the proof of Theorem 3.2 in Zhang and Zheng (2020) and by their result we have
Next, consider . Denote , , and
First, note that . Therefore,
(E.1)
where
In addition, note that
is of the VC-type with fixed coefficients and bounded envelope, and . Therefore, Lemma N.2 implies
By Assumption 1 we have , , and , which imply .
Next, denote . Then
(E.2)
where the second equality holds because
For the first term of , we have
Assumption 3 implies
is of the VC-type with fixed coefficients and an envelope such that for . Therefore,
It is also assumed that and . Therefore, we have
Recall . Then, for the second term of , we have
where the last equality holds by Assumption 3(i). Therefore, we have
Combining (E.1) and (E.2), we have
Note by Assumption 3 that the classes of functions
and
are of the VC-type with fixed coefficients and envelopes belonging to . In addition,
and
Therefore, Lemma N.2 implies,
and
This implies . Then by Kato (2009, Theorem 2), we have
where . Similarly, we have
where and . Taking the difference of the above two displays gives
where . Lemma N.3 shows that, uniformly over ,
where is a Gaussian process with covariance kernel
For the second result in Theorem 3, we denote
(E.3)
Then
and
Let
which does not rely on the working models. Then,
where
Further, denote , the asymptotic variance covariance matrix of as , and the optimal variance covariance matrix as . We have
which is positive semidefinite. In addition, if for , , and in the joint support of . This concludes the proof.
Appendix F Proof of Theorem 5
Notation used in this proof:
- For and , , where is the bootstrap weight.
- For , .
- For , .
- For and , is the bootstrap estimator of with a generic regression adjustment.
We focus on deriving the linear expansion of the bootstrap estimator. Let
where
and
By change of variables, we have
Note that is exactly the same as that considered in the proof of Theorem 3.2 in Zhang and Zheng (2020) and by their result we have
Next consider . Recall and . Denote
First, note that
(F.1)
where ,
Note that
is of the VC-type with fixed coefficients and the envelope , and
We can also let for some constant . Then, Lemma N.2 implies
In addition, Lemma N.4 implies , which further implies . Combining these results, we have
Next, recall . Then
(F.2)
where the second equality holds because
For the first term in , we have
where the last equality holds due to Lemmas N.2 and N.4, and the fact that is of the VC-type with fixed coefficients and envelope such that for
For the second term in , recall . Then
where the last equality holds by Assumption 3. Therefore, we have
Combining (F.1) and (F.2), we have
where . In addition, Assumption 3 implies that the classes of functions
and
are of the VC-type with fixed coefficients and envelopes belonging to . In addition,
and
Appendix G Proof of Theorem 5.1
Notation used in this proof:
- For , .
- For , , , and , is a parametric model for with a pseudo true value .
- For , , , is a consistent estimator of .
The proof is divided into two steps. In the first step, we show Assumption 5. Assumption 3(i) can be shown in the same manner and is omitted. In the second step, we establish Assumptions 3(ii) and 3(iii).
Step 1. Recall , ,
and is generated independently from the joint distribution of given , and so is independent of . Let . We have
(G.1)
To see the last equality, we note that, for any , with probability approaching one (w.p.a.1), we have
Therefore, on the event
we have
where and
By Assumption 6, is a VC-class with a fixed VC index and envelope . In addition,
Therefore, for any we have
By Chernozhukov et al. (2014, Corollary 5.1),
Therefore,
By letting followed by , we have
In addition,
as Lemma N.4 shows that .
Appendix H Proof of Theorem 5.2
Notation (Name / Description):
- For , , and , is the linear regressor in the linear adjustment so that
- For , , and ,
- For , , and , is the pseudo true value in the linear adjustment
Let be the asymptotic covariance matrix of and with a linear adjustment and pseudo true values . Then, we have
where
To minimize (in the matrix sense) is the same as minimizing
for each , which is achieved if
(H.1)
Because for , (H.1) implies
or equivalently,
This concludes the proof.
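The display (H.1) is the first-order condition of a standard quadratic minimization; schematically, with placeholder symbols $C$ and $D \succ 0$ standing in for the moment matrices defined in the main text:

```latex
V(\beta) = V_0 - 2\beta^{\top} C + \beta^{\top} D \beta, \qquad
\frac{\partial V}{\partial \beta} = -2C + 2D\beta = 0
\;\Longrightarrow\; \beta^{*} = D^{-1} C ,
```

so that $V(\beta) - V(\beta^{*}) = (\beta - \beta^{*})^{\top} D (\beta - \beta^{*}) \ge 0$ for every $\beta$, confirming that the candidate coefficient is the unique minimizer.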
Appendix I Proof of Theorem 5.3
Notation (Name / Description):
- For and , is the -th quantile of
- For , , and , is the optimal linear coefficient
- For , , , ,
- For , , and , is defined in (5.7)
- For and , is the estimator of without any adjustments
Assumption 6(i) holds by Assumption 7. In addition, by Assumption 2, we have . This implies Assumption 6(ii). Next, we aim to show
Focusing on we have
(I.1)
For the first term in (I.1), we have
where and is i.i.d. across with common distribution equal to the conditional distribution of given and independent of . Therefore, by Assumption 9, we have
where denotes the Frobenius norm and . For the second term in (I.1), we have
where
and
By Assumption 7 we can show that for some constant . Therefore, we have
and
where we use the fact that, by Assumption 9,
Next, note that , which means for any , there exists a constant such that with probability greater than . On the event that , we have
where the first inequality is due to the triangle inequality, the second inequality is due to the fact that , and the third inequality is due to the fact that is assumed to be bounded. To see the last equality in the above display, we define
with envelope for some , where is the -th coordinate of . Clearly, is of the VC-type with fixed coefficients . In addition,
Therefore, Lemma N.2 implies that . By the usual maximal inequality (e.g., van der Vaart and Wellner (1996, Theorem 2.14.1)), we can show that
Combining these results, we conclude that
and hence
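In concrete terms, the feasible linear adjustment whose consistency is established above can be computed by a within-cell least-squares projection; the following is a minimal sketch under our reading of the construction, with an illustrative interface rather than the paper's exact estimator.

```python
import numpy as np

def linear_adjustment_coef(y, a, s, x, q_hat, arm):
    """Stratum-by-stratum OLS projection of 1{Y <= q_hat[arm]} on (1, X)
    among units with A = arm; returns the fitted linear coefficients."""
    coefs = {}
    for si in np.unique(s):
        cell = (a == arm) & (s == si)
        z = (y[cell] <= q_hat[arm]).astype(float)
        X1 = np.column_stack([np.ones(cell.sum()), x[cell]])  # intercept + covariates
        beta, *_ = np.linalg.lstsq(X1, z, rcond=None)
        coefs[si] = beta
    return coefs
```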
Appendix J Proof of Theorem 5.4
Notation (Name / Description):
- For , for some function
- For and , is the pseudo true value defined in (5.9)
- For and , is the estimator of in (5.8)
Let be the dimension of , , and
and
We note that
is a VC class with a fixed VC index. Then, Lemma N.2 implies
In addition,
and . Therefore,
(J.1)
Further note that is concave in for fixed . Therefore, for where , and
which implies
Because is continuous in , is compact, and is the unique maximizer of , we have
for some . In addition, if , then there exists such that
Therefore,
where the last step is due to (J.1). This implies
Appendix K Proof of Theorem 5.5
Notation (Name / Description):
- Logistic CDF
- For , , and , is the pseudo true value defined in (5.11)
- For , , and , is the estimator of in (5.15)
- For , , , and ,
- For , , and ,
- For , , , , and ,
- For , , , and ,
- For , , , and ,
- For , , and ,
- For , , , , and ,
- For , , , and ,
Recall and . Let be the dimension of . Then, we have
where the functional form is invariant to ,
Suppose
(K.1)
and we also have
by Theorem 5.4. Then Assumption 6(iii) holds for . Assumption 6(i) holds automatically as does not depend on , and Assumption 6(ii) holds by Assumption 11. Then, Theorem 5.1 implies that Assumptions 3 and 5 hold. In addition, Theorem 5.2 implies is the smallest among all linear adjustments with as the regressors.
Therefore, the only thing left is to establish (K.1). First, denote
We note that Assumption 9 holds with by Assumption 11(ii). Then, following the same argument as in the proof of Theorem 5.3, we can show that
Therefore, it suffices to show
Denote and . We have
In addition, denote and . We first consider the case . We have
In addition, Assumption 11 implies . Therefore, we have
Similarly, we have
Last,
This implies
and thus,
In addition,
Therefore, we have
which concludes the proof.
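For concreteness, here is a minimal sketch of a logistic working model of the kind covered by this theorem: within each stratum-arm cell, fit a logit of the indicator that the outcome falls below the estimated quantile on the covariates, and use the fitted probabilities as the adjustment term. The cell-by-cell looping, the scikit-learn call, and the interface are our own illustrative choices, not the estimator in (5.15).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def logistic_adjustment(y, a, s, x, q_hat):
    """Fit the logistic working model cell by cell and return fitted
    adjustment values for every unit (illustrative interface).

    q_hat : dict {arm: estimated tau-th quantile of Y(arm)}
    Returns m[arm], an array of fitted P(Y(arm) <= q_hat[arm] | X) values.
    """
    m = {0: np.empty(len(y)), 1: np.empty(len(y))}
    for arm in (0, 1):
        for si in np.unique(s):
            cell = (a == arm) & (s == si)
            z = (y[cell] <= q_hat[arm]).astype(int)
            # a real implementation must handle cells where z is constant
            logit = LogisticRegression(C=1e6)  # essentially unpenalized logit
            logit.fit(x[cell], z)
            here = (s == si)  # evaluate the fit at all units in the stratum
            m[arm][here] = logit.predict_proba(x[here])[:, 1]
    return m
```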
Appendix L Proof of Theorem 5.6
Notation (Name / Description):
- For , , and , is the pseudo true value defined in Assumption 12(ii)
- For , , and , is the estimator of in (5.17)
The proof strategy follows Belloni et al. (2017) and details are given here for completeness. We divide the proof into three steps. In the first step, we show
In the second step, we establish Assumption 5. By a similar argument, we can establish Assumption 3(i). In the third step, we establish Assumptions 3(ii) and 3(iii).
Step 1. Let ,
and for an arbitrary ,
Then, we have
and
In addition
Therefore, there exists a constant such that
where the first inequality is due to Bach (2010, Lemma 1) and the third inequality holds because
To see the second inequality, note that for and by Assumption 12,
and
This implies
and thus,
Let
(L.1)
If , then
and thus
On the other hand, if , we can denote such that
Further, because is convex in , we have
Therefore, for some constant that only depends on and , we have
(L.2)
In addition, by construction,
(L.3)
Combining (L.2) and (L.3), we have
Taking on both sides, we have
where the last line holds due to Assumption 12 and Lemma N.6. Finally, Lemma N.7 shows that , which implies
Step 2. Recall
and is generated independently from the joint distribution of given , and so is independent of . Let
We have
(L.4)
We aim to bound the first term on the RHS of (L.4). Note that for any , there exists a constant such that
On the set , we have
For , we have
where the supremum in the first equality is taken over and
with the envelope . We further note that ,
and
where are two fixed constants. Therefore, by Chernozhukov et al. (2014, Corollary 5.1),
which implies .
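In the high-dimensional setting of this theorem, the auxiliary logit is fitted with an $\ell_1$ penalty. The following sketch uses scikit-learn's penalized logistic regression as an off-the-shelf stand-in for the estimator in (5.17); the plug-in penalty level is a heuristic of order $\sqrt{\log(\dim)/n}$, not the calibrated loading of the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def lasso_logit_adjustment(y_cell, x_cell, q):
    """l1-penalized logit of 1{Y <= q} on high-dimensional X within one
    stratum-arm cell; returns fitted probabilities for that cell."""
    z = (y_cell <= q).astype(int)
    n, p = x_cell.shape
    lam = np.sqrt(np.log(max(p, 2)) / n)  # heuristic penalty level, not the paper's loading
    lasso_logit = LogisticRegression(penalty="l1", solver="liblinear", C=1.0 / lam)
    lasso_logit.fit(x_cell, z)
    return lasso_logit.predict_proba(x_cell)[:, 1]
```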
Appendix M Proof of Theorem A.1
Notation (Name / Description):
- High-dimensional regressor constructed based on with dimension
- For , , and , is the pseudo true value defined in Assumption 13(i)
- For , , and , is the estimator of in (A.1)
- Lasso penalty defined after (A.1)
- Lasso penalty loading matrix defined after (A.1)
- For , , , and ,
We focus on the case with . Note
where is an i.i.d. sequence that is independent of . Therefore,
and Assumption 13(vi) implies
and
In addition, we have . Therefore, based on the results established by Belloni et al. (2017), we have, conditionally on , and thus, unconditionally,
and
In the following, we prove the results when is used. The results corresponding to can be proved in the same manner and are therefore omitted. Recall
where
Let
and
Then, we have
(M.1)
We aim to bound the first term on the RHS of (M.1). Note that for any , there exists a constant such that
On the set
we have
where the first supremum in the second inequality is taken over . Denote
with the envelope . We further note that ,
and
where are two fixed constants. Therefore, Lemma N.2 implies
Similarly, denote
with an envelope . In addition, note that is nested in
with the same envelope. Hence,
Last,
Therefore, Lemma N.2 implies
This leads to (M.1). We can establish Assumption 3(i) in the same manner. Assumptions 3(ii) and 3(iii) can be established by the same argument used in Step 3 of the proof of Theorem 5.6. This concludes the proof of Theorem A.1.
Appendix N Technical Lemmas
The first lemma was established in Zhang and Zheng (2020).
Lemma N.1.
Let be the -th partial sum of Banach-space-valued independent and identically distributed random variables. Then
Proof.
The next lemma is due to Chernozhukov et al. (2014); we modify their maximal inequality to accommodate covariate-adaptive randomization.
Lemma N.2.
Proof.
We focus on establishing the first statement. The proof of the second statement is similar and is omitted. Following Bugni et al. (2018), we define the sequence of i.i.d. random variables with marginal distributions equal to the distribution of . The distribution of is the same as its counterpart with units ordered by strata and then, within each stratum, with treated units first and untreated units second, i.e.,
where and
Let . Then, for some constant , we have
where and are the empirical process and the expectation w.r.t. the i.i.d. data , respectively; the second inequality is due to Lemma N.1; the last equality is due to the fact that
and the last inequality is due to the fact that, by Chernozhukov et al. (2014, Corollary 5.1),
Then, for any , we can choose so that
which implies the desired result. ∎
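Throughout the appendix, "VC-type with fixed coefficients" is used in the standard uniform-entropy sense of Chernozhukov et al. (2014): a class $\mathcal{F}$ with measurable envelope $F$ is of VC-type with coefficients $(A, v)$ if

```latex
\sup_{Q} N\!\left(\mathcal{F}, \, \|\cdot\|_{Q,2}, \, \varepsilon \|F\|_{Q,2}\right)
  \;\le\; \left(\frac{A}{\varepsilon}\right)^{v},
  \qquad 0 < \varepsilon \le 1,
```

where $N(\cdot)$ denotes the covering number and the supremum is taken over all finitely discrete probability measures $Q$.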
The next lemma is similar to Zhang and Zheng (2020, Lemma E.2) but with additional covariates and regression adjustments. It is retained in the Supplement to make the paper self-contained.
Lemma N.3.
Suppose Assumptions in Theorem 3 hold. Denote
and
Then, uniformly over ,
where are two independent Gaussian processes with covariance kernels and , respectively, such that
and
Proof.
We follow the general argument in the proof of Bugni et al. (2018, Lemma B.2). We divide the proof into two steps. In the first step, we show that
where the term holds uniformly over , , and, uniformly over ,
In the second step, we show that
uniformly over .
Step 1. Recall that we define as a sequence of i.i.d. random variables with marginal distributions equal to the distribution of and . The distribution of is the same as the counterpart with units ordered by strata and then ordered by first and second within each stratum, i.e.,
where
with
and
As is only a function of , we have
Let , , and
Note is a function of , which is independent of by construction. Therefore,
Note that
Denote for . In order to show that and , it suffices to show that (1) for and , the stochastic process
is stochastically equicontinuous and (2) converges to in finite dimension.
Claim (1). We want to bound
where the supremum is taken over and such that . Note that
(N.1)
Then, for an arbitrary , by taking , we have
where in the first inequality, and the second inequality holds due to Lemma N.1. To see the third inequality, denote
with an envelope function such that, by Assumption 3, . In addition, by Assumption 3 again and the fact that
is of the VC-type with fixed coefficients , and so is . Then, we have
where
is the covering number, and the supremum is taken over all discrete probability measures . Therefore, by van der Vaart and Wellner (1996, Theorem 2.14.1)
For the second term on the RHS of (N.1), by taking , we have
where in the first equality, and the first inequality is due to Lemma N.1. To see the last inequality, denote
with a constant envelope function such that . In addition, due to Assumptions 2(ii) and 3(iii), one can show that
for some constant . Last, due to Assumption 3(ii), is of the VC-type with fixed coefficients . Therefore, by Chernozhukov et al. (2014, Corollary 5.1),
where the last inequality holds by letting be sufficiently large. Note that as . This concludes the proof of Claim (1).
Claim (2). For a single , by the triangular array CLT,
where
Finite dimensional convergence is proved by the Cramér-Wold device. In particular, we can show that the covariance kernel is
This concludes the proof of Claim (2), and thereby leads to the desired results in Step 1.
Step 2. As is Lipschitz continuous in with a bounded Lipschitz constant, is of the VC-type with fixed coefficients and a constant envelope function. Therefore, is a Donsker class and we have
where is a Gaussian process with covariance kernel
This concludes the proof. ∎
Lemma N.4.
Suppose the Assumptions in Theorem 5 hold and recall . Then, and .
Proof.
We note that and . Therefore, we only need to show
As a.s., given data,
Then, by the Lindeberg CLT, conditionally on data,
and thus
∎
Lemma N.5.
Suppose the Assumptions in Theorem 5 hold. Then, uniformly over ,
where is a Gaussian process with the covariance kernel
Proof.
We divide the proof into two steps. In the first step, we show the conditional stochastic equicontinuity of and . In the second step, we show the finite-dimensional convergence of conditional on data.
Step 1. Following the same idea as in the proof of Lemma N.3, we define as a sequence of i.i.d. random variables with marginal distributions equal to the distribution of and . The distribution of is the same as its counterpart with units ordered by strata and then, within each stratum, with treated units first and untreated units second, i.e.,
and thus,
(N.2)
where
In addition, let
Following exactly the same argument as in the proof of Lemma N.3, we have
(N.3)
and is unconditionally stochastically equicontinuous, i.e., for any , as followed by , we have
where means the probability operator is with respect to the bootstrap weights and is conditional on data. This implies the unconditional stochastic equicontinuity of due to (N.2) and (N.3), which further implies the conditional stochastic equicontinuity of , i.e., for any , as followed by ,
By a similar but simpler argument, the conditional stochastic equicontinuity of holds as well. This concludes the first step.
Step 2. We first show the asymptotic normality of conditionally on data for a fixed . Note
Conditionally on data, is a sequence of independent but not identically distributed (i.n.i.d.) random variables. In order to apply the Lindeberg-Feller central limit theorem, we only need to show that (1)
where is defined in Theorem 3, and (2) the Lindeberg condition holds, i.e.,
For part (1), we have
where
and
Note
where the convergence holds since , , , and the uniform convergence of the partial sum process. Similarly,
where we use the fact that
By the standard weak law of large numbers, we have
Therefore,
To verify the Lindeberg condition, we note that
where the last equality is due to Assumption 3(ii) and the fact that is bounded.
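For reference, the Lindeberg condition verified above has the generic form (with $Z_{n,i}$ standing in for the centered bootstrap summands and $s_n^2$ for their total conditional variance; these placeholder symbols are ours):

```latex
\frac{1}{s_n^{2}} \sum_{i=1}^{n}
  \mathbb{E}\left[ Z_{n,i}^{2} \, \mathbf{1}\{ |Z_{n,i}| > \varepsilon s_n \} \;\middle|\; \text{data} \right]
  \xrightarrow{\;p\;} 0
  \qquad \text{for every } \varepsilon > 0 .
```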
Finite-dimensional convergence of across can be established in the same manner using the Cramér-Wold device, and the details are omitted. By the same calculation as that given above, the covariance kernel is shown to be
which concludes the proof. ∎
Lemma N.6.
Suppose the Assumptions in Theorem 5.6 hold. Then,
Proof.
We focus on . We have
(N.4)
Define
and let be the -th coordinate of . For each , is of the VC-type with fixed coefficients and a common envelope , i.e.,
where the supremum is taken over all finitely discrete probability measures. This implies
i.e., is also of the VC-type with coefficients . In addition,
Then, Lemma N.2 implies
Lemma N.7.
Proof.
References
- Abadie et al. (2018) Abadie, A., M. M. Chingos, and M. R. West (2018). Endogenous stratification in randomized experiments. Review of Economics and Statistics 100(4), 567–580.
- Anderson and McKenzie (2021) Anderson, S. J. and D. McKenzie (2021). Improving business practices and the boundary of the entrepreneur: a randomized experiment comparing training, consulting, insourcing and outsourcing. Journal of Political Economy, 130(1), 157–209.
- Ansel et al. (2018) Ansel, J., H. Hong, and J. Li (2018). OLS and 2SLS in randomized and conditionally randomized experiments. Journal of Economics and Statistics 238, 243–293.
- Athey and Imbens (2017) Athey, S. and G. W. Imbens (2017). The econometrics of randomized experiments. In Handbook of Economic Field Experiments, Volume 1, pp. 73–140. Elsevier.
- Bach (2010) Bach, F. (2010). Self-concordant analysis for logistic regression. Electronic Journal of Statistics 4, 384–414.
- Bai (2020) Bai, Y. (2020). Optimality of matched-pair designs in randomized controlled trials. Available at SSRN 3483834.
- Bai et al. (2021) Bai, Y., A. Shaikh, and J. P. Romano (2021). Inference in experiments with matched pairs. Journal of the American Statistical Association, forthcoming.
- Banerjee et al. (2015) Banerjee, A., E. Duflo, R. Glennerster, and C. Kinnan (2015). The miracle of microfinance? Evidence from a randomized evaluation. American Economic Journal: Applied Economics 7(1), 22–53.
- Belloni et al. (2017) Belloni, A., V. Chernozhukov, I. Fernández-Val, and C. Hansen (2017). Program evaluation with high-dimensional data. Econometrica 85(1), 233–298.
- Bitler et al. (2006) Bitler, M. P., J. B. Gelbach, and H. W. Hoynes (2006). What mean impacts miss: distributional effects of welfare reform experiments. American Economic Review 96(4), 988–1012.
- Bloniarz et al. (2016) Bloniarz, A., H. Liu, C.-H. Zhang, J. S. Sekhon, and B. Yu (2016). Lasso adjustments of treatment effect estimates in randomized experiments. Proceedings of the National Academy of Sciences 113(27), 7383–7390.
- Box (1976) Box, G. E. (1976). Science and statistics. Journal of the American Statistical Association 71(356), 791–799.
- Bradic et al. (2019) Bradic, J., S. Wager, and Y. Zhu (2019). Sparsity double robust inference of average treatment effects. arXiv preprint arXiv: 1905.00744.
- Bugni et al. (2018) Bugni, F. A., I. A. Canay, and A. M. Shaikh (2018). Inference under covariate-adaptive randomization. Journal of the American Statistical Association 113(524), 1741–1768.
- Bugni et al. (2019) Bugni, F. A., I. A. Canay, and A. M. Shaikh (2019). Inference under covariate-adaptive randomization with multiple treatments. Quantitative Economics 10(4), 1747–1785.
- Bugni and Gao (2021) Bugni, F. A. and M. Gao (2021). Inference under covariate-adaptive randomization with imperfect compliance. arXiv preprint arXiv: 2102.03937.
- Burchardi et al. (2019) Burchardi, K. B., S. Gulesci, B. Lerva, and M. Sulaiman (2019). Moral hazard: experimental evidence from tenancy contracts. Quarterly Journal of Economics 134(1), 281–347.
- Campos et al. (2017) Campos, F., M. Frese, M. Goldstein, L. Iacovone, H. C. Johnson, D. McKenzie, and M. Mensmann (2017). Teaching personal initiative beats traditional training in boosting small business in West Africa. Science 357(6357), 1287–1290.
- Chen (2007) Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. In Handbook of Econometrics, Volume 6, pp. 5549–5632. Elsevier.
- Chernozhukov et al. (2014) Chernozhukov, V., D. Chetverikov, and K. Kato (2014). Gaussian approximation of suprema of empirical processes. Annals of Statistics 42(4), 1564–1597.
- Chernozhukov et al. (2013) Chernozhukov, V., I. Fernández-Val, and B. Melly (2013). Inference on counterfactual distributions. Econometrica 81(6), 2205–2268.
- Chong et al. (2016) Chong, A., I. Cohen, E. Field, E. Nakasone, and M. Torero (2016). Iron deficiency and schooling attainment in Peru. American Economic Journal: Applied Economics 8(4), 222–55.
- Cohen and Fogarty (2020) Cohen, P. L. and C. B. Fogarty (2020). No-harm calibration for generalized oaxaca-blinder estimators. arXiv preprint arXiv:2012.09246.
- Crépon et al. (2015) Crépon, B., F. Devoto, E. Duflo, and W. Parienté (2015). Estimating the impact of microcredit on those who take it up: evidence from a randomized experiment in Morocco. American Economic Journal: Applied Economics 7(1), 123–50.
- Duflo et al. (2013) Duflo, E., M. Greenstone, R. Pande, and N. Ryan (2013). Truth-telling by third-party auditors and the response of polluting firms: experimental evidence from India. Quarterly Journal of Economics 128(4), 1499–1545.
- Dupas et al. (2018) Dupas, P., D. Karlan, J. Robinson, and D. Ubfal (2018). Banking the unbanked? evidence from three countries. American Economic Journal: Applied Economics 10(2), 257–297.
- Firpo (2007) Firpo, S. (2007). Efficient semiparametric estimation of quantile treatment effects. Econometrica 75(1), 259–276.
- Fogarty (2018) Fogarty, C. B. (2018). Regression-assisted inference for the average treatment effect in paired experiments. Biometrika 105(4), 994–1000.
- Freedman (2008a) Freedman, D. A. (2008a). On regression adjustments in experiments with several treatments. Annals of Applied Statistics 2(1), 176–196.
- Freedman (2008b) Freedman, D. A. (2008b). On regression adjustments to experimental data. Advances in Applied Mathematics 40(2), 180–193.
- Greaney et al. (2016) Greaney, B. P., J. P. Kaboski, and E. Van Leemput (2016). Can self-help groups really be “self-help”? Review of Economic Studies 83(4), 1614–1644.
- Hahn et al. (2011) Hahn, J., K. Hirano, and D. Karlan (2011). Adaptive experimental design using the propensity score. Journal of Business & Economic Statistics 29(1), 96–108.
- Hahn and Liao (2021) Hahn, J. and Z. Liao (2021). Bootstrap standard error estimates and inference. Econometrica 89(4), 1963–1977.
- Hirano et al. (2003) Hirano, K., G. W. Imbens, and G. Ridder (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4), 1161–1189.
- Hu and Hu (2012) Hu, Y. and F. Hu (2012). Asymptotic properties of covariate-adaptive randomization. Annals of Statistics 40(3), 1794–1815.
- Jakiela and Ozier (2016) Jakiela, P. and O. Ozier (2016). Does Africa need a rotten kin theorem? Experimental evidence from village economies. Review of Economic Studies 83(1), 231–268.
- Jiang et al. (2021) Jiang, L., X. Liu, P. C. B. Phillips, and Y. Zhang (2021). Bootstrap inference for quantile treatment effects in randomized experiments with matched pairs. Review of Economics and Statistics, forthcoming.
- Kallus et al. (2020) Kallus, N., X. Mao, and M. Uehara (2020). Localized debiased machine learning: efficient inference on quantile treatment effects and beyond. arXiv preprint arXiv: 1912.12945.
- Karlan et al. (2014) Karlan, D., A. L. Ratan, and J. Zinman (2014). Savings by and for the poor: A research review and agenda. Review of Income and Wealth 60(1), 36–78.
- Kato (2009) Kato, K. (2009). Asymptotics for argmin processes: convexity arguments. Journal of Multivariate Analysis 100(8), 1816–1829.
- Knight (1998) Knight, K. (1998). Limiting distributions for regression estimators under general conditions. Annals of Statistics 26(2), 755–770.
- Lei and Ding (2021) Lei, L. and P. Ding (2021). Regression adjustment in completely randomized experiments with a diverging number of covariates. Biometrika, 108(4), 815–828.
- Li and Ding (2020) Li, X. and P. Ding (2020). Rerandomization and regression adjustment. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82(1), 241–268.
- Lin (2013) Lin, W. (2013). Agnostic notes on regression adjustments to experimental data: reexamining Freedman’s critique. Annals of Applied Statistics 7(1), 295–318.
- Liu et al. (2020) Liu, H., F. Tu, and W. Ma (2020). A general theory of regression adjustment for covariate-adaptive randomization: OLS, Lasso, and beyond. arXiv preprint arXiv: 2011.09734.
- Liu and Yang (2020) Liu, H. and Y. Yang (2020). Regression-adjusted average treatment effect estimates in stratified randomized experiments. Biometrika 107(4), 935–948.
- Lu (2016) Lu, J. (2016). Covariate adjustment in randomization-based causal inference for 2^K factorial designs. Statistics & Probability Letters 119, 11–20.
- Ma et al. (2015) Ma, W., F. Hu, and L. Zhang (2015). Testing hypotheses of covariate-adaptive randomized clinical trials. Journal of the American Statistical Association 110(510), 669–680.
- Ma et al. (2020) Ma, W., Y. Qin, Y. Li, and F. Hu (2020). Statistical inference for covariate-adaptive randomization procedures. Journal of the American Statistical Association 115(531), 1488–1497.
- Montgomery-Smith (1993) Montgomery-Smith, S. J. (1993). Comparison of sums of independent identically distributed random variables. Probability and Mathematical Statistics 14(2), 281–285.
- Muralidharan and Sundararaman (2011) Muralidharan, K. and V. Sundararaman (2011). Teacher performance pay: experimental evidence from India. Journal of Political Economy 119(1), 39–77.
- Negi and Wooldridge (2020) Negi, A. and J. M. Wooldridge (2020). Revisiting regression adjustment in experiments with heterogeneous treatment effects. Econometric Reviews 40(5), 1–31.
- Olivares (2021) Olivares, M. (2021). Robust permutation test for equality of distributions under covariate-adaptive randomization. Working paper, University of Illinois at Urbana Champaign.
- Shao and Yu (2013) Shao, J. and X. Yu (2013). Validity of tests under covariate-adaptive biased coin randomization and generalized linear models. Biometrics 69(4), 960–969.
- Shao et al. (2010) Shao, J., X. Yu, and B. Zhong (2010). A theory for testing hypotheses under covariate-adaptive randomization. Biometrika 97(2), 347–360.
- Tabord-Meehan (2021) Tabord-Meehan, M. (2021). Stratification trees for adaptive randomization in randomized controlled trials. arXiv preprint arXiv: 1806.05127.
- Tan (2020) Tan, Z. (2020). Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data. Annals of Statistics 48(2), 811–837.
- van der Vaart and Wellner (1996) van der Vaart, A. and J. A. Wellner (1996). Weak Convergence and Empirical Processes. Springer, New York.
- von Neumann (2019) von Neumann, J. (2019). The mathematician. In Mathematics: People, Problems, Results (2 ed.). Chapman and Hall/CRC.
- Wei (1978) Wei, L. (1978). An application of an urn model to the design of sequential controlled clinical trials. Journal of the American Statistical Association 73(363), 559–563.
- Ye (2018) Ye, T. (2018). Testing hypotheses under covariate-adaptive randomisation and additive models. Statistical Theory and Related Fields 2(1), 96–101.
- Ye and Shao (2020) Ye, T. and J. Shao (2020). Robust tests for treatment effect in survival analysis under covariate-adaptive randomization. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82(5), 1301–1323.
- Ye et al. (2022) Ye, T., Y. Yi, and J. Shao (2022). Inference on average treatment effect under minimization and other covariate-adaptive randomization methods. Biometrika, 109(1), 33–47.
- Zhang and Zheng (2020) Zhang, Y. and X. Zheng (2020). Quantile treatment effects and bootstrap inference under covariate-adaptive randomization. Quantitative Economics 11(3), 957–982.
- Zhao and Ding (2021) Zhao, A. and P. Ding (2021). Covariate-adjusted fisher randomization tests for the average treatment effect. Journal of Econometrics 225(2), 278–294.