
The causal effects of modified treatment policies under network interference

Salvador V. Balkus
Department of Biostatistics,
Harvard T.H. Chan School of Public Health

Scott W. Delaney
Department of Environmental Health,
Harvard T.H. Chan School of Public Health

Nima S. Hejazi^†
Department of Biostatistics,
Harvard T.H. Chan School of Public Health

Abstract

Modified treatment policies are a widely applicable class of interventions used to study the causal effects of continuous exposures. Approaches to evaluating their causal effects assume no interference, meaning that such effects cannot be learned from data in settings where the exposure of one unit affects the outcome of others, as is common in spatial or network data. We introduce a new class of intervention—induced modified treatment policies—which we show identify such causal effects in the presence of network interference. Building on recent developments in network causal inference, we provide flexible, semi-parametric efficient estimators of the identified statistical estimand. Simulation experiments demonstrate that an induced modified treatment policy can eliminate causal (or identification) bias resulting from interference. We use the methods developed to evaluate the effect of zero-emission vehicle uptake on air pollution in California, strengthening prior evidence.

† To whom correspondence should be addressed.

1 Introduction

Causal inference methodology commonly assumes no interference, that is, that one unit’s exposure level does not affect any other unit’s outcome (cox1958planning; rubin1980). Oftentimes, though, studies involve units that interact with one another, violating this assumption. This occurs, for instance, in settings where each study unit corresponds to a geographic area, such as a county or ZIP code area. People often move between such areas; as such, application of an exposure policy to a given county affects not only those residing in that region but also those commuting to and from there.

Drawing reliable conclusions about the effects of exposure policies in such settings, common as they are, necessitates the use of methodology designed to answer causal questions such as, “if one intervened upon a population via exposure $A=a$ , how would that have changed the outcome $Y$ ?” (rubin2005). Ignoring interference between units when estimating causal effects risks invalid identification and overtly biased results (halloranDependentHappeningsRecent2016). Despite these challenges, data from studies that involve interference between study units can be exceedingly useful, even critical, to advancing science and policy in some settings. For instance, environmental epidemiologists may leverage observational spatial data to study the effects of pollution on human health, a case where randomized experiments are often impractical or unethical (elliottSpatialEpidemiologyCurrent2004; reichReviewSpatialCausal2021a; morrisonDefiningSpatialEpidemiology).

However, the current state of analysis techniques in such studies fails to reliably answer questions of causality while appropriately addressing issues posed by interference. As interference frequently occurs in data generated by complex studies with a large number of variables that may confound the treatment–outcome relationship, a promising strategy is to use the large set of measured confounders to account for underlying interference by way of regression adjustment. Doing so requires investigators to employ flexible regression approaches to learn the unknown relationships while avoiding restrictive modeling assumptions that are likely to render downstream inference invalid. Compounding this issue, many studies also seek to answer questions about the effects of continuous (or quantitative) exposures; these pose their own challenges for identification and estimation, including possible violations of the positivity (or common support) assumption and the construction of asymptotically efficient estimators. The present work aims to provide methods for inference about the causal effect of a continuous exposure that both account for network interference between units and use flexible regression procedures in the estimation process.

Modified Treatment Policies (MTPs) (robins2004; munozPopulationInterventionCausal2012; haneuseEstimationEffectInterventions2013; young2014identification) are a class of interventions that define causal estimands well-suited for formulating counterfactual questions about continuous exposures. An MTP can answer the scientific question, “how much would the average value of $Y$ have changed had the natural value of $A$ been shifted by an increment $\delta$ ?” (where $\delta$ is chosen by the investigator).

Such questions are often as scientifically relevant as the analogous question answered by the causal dose-response curve, namely, “what would the average value of $Y$ be had every unit been set to the same level of the treatment $A=a$ for all $a\in\mathcal{A}$ ?”, while being easier to pose and answer. Studies on populations exhibiting interference are frequently conducted precisely because setting every unit’s exposure to the same level would be unrealistic in practice, making the causal dose-response curve’s assumption of common support more restrictive than necessary. The MTP framework has gained traction for its applicability in settings involving longitudinal, time-varying interventions (diazNonparametricCausalEffects2023; hoffman2024studying); causal mediation analysis (diaz2020causal; hejazi2023nonparametric), including with time-varying mediator–confounder feedback (gilbert2024identification); and causal survival analysis under competing risks (diaz2024causal). However, no work has, to our knowledge, extended the MTP framework to dependent data settings characterized by interference between units.

Contributions. This work presents an inferential framework for estimating the causal effect of an MTP under known network interference (that is, where the interference structure is known to the investigator). First, we introduce the concept of an induced MTP, a new type of intervention that identifies the causal effect of an MTP when network interference is present; this is necessary to account for how the application of an MTP to a given unit affects its neighbors in the network. Next, we prove necessary and sufficient conditions for the induced MTP to yield an identifiable estimand. Then, adapting recent theoretical tools for network causal inference (ogburnCausalInferenceSocial2022), we describe consistent and semi-parametric efficient estimation strategies. Finally, we evaluate how the use of an induced MTP to correct for interference can improve on classical techniques, in both numerical experiments and in a data analysis studying the effect of zero-emission vehicle uptake in California on NO₂ air pollution. The tools we describe allow MTPs to be evaluated in previously intractable settings, including with spatial data and social networks.

Outline. Section 2 reviews MTPs and causal inference under interference. Section 3 describes the induced MTP, including identification of causal effects as well as point and variance estimation. We then report numerical results verifying the proposed methodology and demonstrate its application in our motivating data analysis. We conclude by discussing future directions.

2 Background

Throughout, we let capital bold letters denote random $n$ -vectors; for instance, $\mathbf{Y}=(Y_{1},\ldots,Y_{n})$ . Consider data $\mathbf{O}=(\mathbf{L},\mathbf{A},\mathbf{Y})\sim\mathsf{P}\in\mathcal{M}$ , where $\mathsf{P}$ is the true and unknown data-generating distribution of $\mathbf{O}$ and $\mathcal{M}$ is a non-parametric statistical model (i.e., set of candidate data-generating probability distributions) that places no restrictions on the data-generating distribution; in principle, $\mathcal{M}$ may be restricted to incorporate any available real-world knowledge about the system under study. Let $O_{i}=(L_{i},A_{i},Y_{i})$ represent measurements on the $i^{\text{th}}$ individual unit, where $Y_{i}$ is an outcome of interest, $A_{i}$ is a continuous exposure with support $\mathcal{A}$ , and $L_{i}$ is a collection of baseline (i.e., pre-exposure) covariates. To ease notational burden, we omit subscripts $i$ when referring to an arbitrary unit $i$ (i.e., $Y=Y_{i}$ when the specific unit index is not informative).

We will assume that the data-generating process can be expressed via a structural causal model (SCM, pearl2000models) encoding the temporal ordering between variables: $\mathbf{L}$ is generated first, then $\mathbf{A}$ , and finally $\mathbf{Y}$ . We denote by $Y(a)$ the counterfactual random variable generated by hypothetically intervening upon $A$ to set it to $a\in\mathcal{A}$ and observing its impact on the component of the SCM that generates $Y$ . Our goal will be to reason scientifically about the causal relationship between $\mathbf{A}$ and $\mathbf{Y}$ in spite of the presence of confounders $\mathbf{L}$ and network interference between units $i=1,\ldots,n$ .

2.1 Continuous Exposures with Modified Treatment Policies

A Modified Treatment Policy (MTP) is a user-specified function $d(a,l;\delta)$ that maps the observed value $a$ of an exposure $A$ to a new value and may itself depend on the natural (i.e., pre-intervention) value of the exposure (haneuseEstimationEffectInterventions2013).

Example 1 (Additive Shift).

For a fixed $\delta$ , an additive shift MTP may be defined as

\begin{equation}
d(a,l;\delta)=a+\delta\ . \tag{1}
\end{equation}

This corresponds to the scientific question, “how much of a change in $Y$ would be caused by adding $\delta$ to the observed natural value of $a$ , for all units regardless of stratum $l$ ?”

Example 2 (Multiplicative Shift).

For a fixed $\delta$ , a multiplicative shift MTP is defined as

\begin{equation}
d(a,l;\delta)=\delta\cdot a\ . \tag{2}
\end{equation}

This asks the scientific question “how much of a change in $Y$ would be caused by scaling the observed natural value of $a$ by $\delta$ , for all units regardless of stratum $l$ ?”

Note that in the above, $\delta$ is a fixed, user-specified parameter specifying the magnitude of the hypothetical intervention. The MTP framework can also consider interventions that change depending on values of measured covariates $L$ .

Example 3 (Piecewise Additive Shift).

One could propose as an MTP a piecewise additive function

\begin{equation}
d(a,l;\delta)=\begin{cases}a+\delta\cdot l & a\in\mathcal{A}(l)\\ a & \text{otherwise}\ ,\end{cases} \tag{3}
\end{equation}

which applies an intervention whose scale depends on the value of a covariate $l$ and only occurs if $a$ is within some specific subset $\mathcal{A}(l)\subset\mathcal{A}$ of the support of $A$ .
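Examples 1 to 3 can be written as short functions. The following is a minimal sketch in Python with NumPy, not part of the original methodology: the function names, the toy values, and in particular the indicator `in_region` standing in for the set $\mathcal{A}(l)$ are hypothetical choices made only for illustration.

```python
import numpy as np

def additive_shift(a, l, delta):
    """Example 1: add delta to the natural exposure, regardless of l."""
    return a + delta

def multiplicative_shift(a, l, delta):
    """Example 2: scale the natural exposure by delta, regardless of l."""
    return delta * a

def piecewise_additive_shift(a, l, delta, in_region):
    """Example 3: shift by delta * l only where a lies in A(l).

    `in_region(a, l)` is a hypothetical user-supplied indicator for the
    set A(l); the text does not specify this set.
    """
    return np.where(in_region(a, l), a + delta * l, a)

# Toy natural exposures and covariates (illustrative values only).
a = np.array([1.0, 2.0, 5.0])
l = np.array([0.5, 1.0, 2.0])

shifted_add = additive_shift(a, l, delta=0.5)
shifted_mul = multiplicative_shift(a, l, delta=2.0)
# Assumed region: only exposures of at most 3 are modified.
shifted_pw = piecewise_additive_shift(
    a, l, delta=1.0, in_region=lambda a, l: a <= 3.0
)
```

In each case the policy takes the natural exposure as input, which is what distinguishes an MTP from an intervention that sets every unit to a fixed level.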

MTPs can be used to define scientifically relevant causal estimands for continuous exposures. The population intervention causal effect of an MTP (munozPopulationInterventionCausal2012) is defined as $\mathbb{E}_{\mathsf{P}}(Y(d(a,l;\delta))-Y(a))$ ; that is, the average difference between the outcome $Y$ that did occur under the observed natural value of treatment $a$ , i.e., $Y=Y(a)$ , and the counterfactual outcome $Y(d(a,l;\delta))$ that would have occurred under the user-specified modified treatment policy $d(a,l;\delta)$ . This estimand answers the scientific question, “what would happen if we applied some policy on the study population that modified the existing exposure according to a rule encoded by $d(\cdot;\delta)$ ?”

MTPs are useful because they allow the investigator to specify a wide range of interpretable and realistic interventions that may be carried out in practice. In addition, MTPs may be seen as a non-parametric extension of widely used associational estimands. For example, the interpretation of the additive shift of Example 1 is a causally-informed analogue of the “risk difference” interpretation of a linear regression coefficient.

2.2 Causal Inference Under Interference

Interference occurs when, for a given unit $i$ , the outcome of interest $Y_{i}$ depends not only on its own assigned exposure $A_{i}$ but also upon the exposure $A_{j}$ of at least one other unit ( $j\neq i$ ). Formally, no interference assumes $Y_{i}(a)\mbox{$\perp\!\!\!\perp$}A_{j}$ for all $i\neq j$ . It is a component of the well-known stable unit treatment value assumption (SUTVA; rubin1980), commonly assumed for the purpose of identification of a wide variety of causal estimands, including those defined by MTPs (haneuseEstimationEffectInterventions2013; young2014identification).

Previous work has focused on settings exhibiting partial interference (hudgensCausalInferenceInterference2008; tchetgenCausalInferencePresence2012; halloranDependentHappeningsRecent2016), which occurs when units can be partitioned into clusters such that interference only occurs between units in the same cluster. Our work focuses instead on a broader setting, that of network interference (vanderlaanCausalInferencePopulation2014), which occurs when a unit’s outcome is subject to interference by other units’ exposures according to some arbitrary known network of relationships between units. When such interference is present, the data $\mathbf{O}$ includes an adjacency matrix or network profile, $\mathbf{F}$ , describing each unit’s “friends” or neighboring units (sofryginSemiParametricEstimationInference2017).

aronowEstimatingAverageCausal2017 demonstrate how to identify a causal estimand when SUTVA is violated due to network interference. To do this, they use an exposure mapping: a function that maps the exposure assignment vector $\mathbf{A}$ to the exposure actually received by each unit. The exposure received is a function of a unit’s original exposure $\mathbf{A}$ and covariates $\mathbf{L}$ , including the network profile $\mathbf{F}$ . If the exposure mapping is correctly specified and consistent, then SUTVA is restored and causal effects subject to interference can be identified by replacing the original exposure with that which arises from the exposure mapping. ogburnCausalInferenceSocial2022 and vanderlaanCausalInferencePopulation2014 rely on similar logic to identify population causal effects from data exhibiting a causally dependent structure, doing so by constructing “summary functions” of neighboring units’ exposures.

Notably, ogburnCausalInferenceSocial2022 and vanderlaanCausalInferencePopulation2014 describe how to construct asymptotically optimal semi-parametric efficient estimators of causal effects of stochastic interventions in the network dependence setting. Stochastic interventions, which differ from MTPs, replace the natural value of exposure with a random draw from a user-specified counterfactual distribution (munozPopulationInterventionCausal2012). While the hypothetical exposure that results from this random draw is not guaranteed to match that which would result from an MTP, the two classes of interventions may be constructed to yield equivalent counterfactual means (young2014identification). Given the similarities between these intervention schemes, we build on previous authors’ recent theoretical developments to construct semi-parametric efficient estimators of the causal effects of MTPs under network interference, thereby extending the range of settings in which MTPs may be applied.

Other relevant works address interference under different assumptions: random networks (clarkCausalInferenceStochastic2024), multiple outcomes (shin2023), long-range dependence (tchetgentchetgenAutoGComputationCausalEffects2021), bipartite graphs (ziglerBipartiteInterferenceAir2023), and unknown network structure (ohnishi2022degree; hoshino2023causal). We focus on the setting described by ogburnCausalInferenceSocial2022, defined in Section 3, as their scientific goals most closely resemble those of MTPs.

3 Methodology

Let us first formally define the interference structure of interest. Suppose there exists a network describing whether two units are causally dependent with adjacency matrix $\mathbf{F}$ , where $F_{i}$ denotes the friends of unit $i$ . For each unit $i$ , a set of confounders $L_{i}$ is drawn, followed by a treatment $A_{i}$ based on a summary $L_{i}^{s}$ of its own and its friends’ confounders, and finally an outcome based on $L_{i}^{s}$ and a summary of its own and its friends’ treatment, $A_{i}^{s}$ . This data-generating process can be defined formally as the SCM in Equation (4):

\begin{equation}
L_{i}=f_{L}(\varepsilon_{L_{i}});\quad A_{i}=f_{A}(L_{i}^{s},\varepsilon_{A_{i}});\quad Y_{i}=f_{Y}(A_{i}^{s},L_{i}^{s},\varepsilon_{Y_{i}}) \tag{4}
\end{equation}

Following ogburnCausalInferenceSocial2022, we assume error vectors $(\varepsilon_{L_{1}},\ldots,\varepsilon_{L_{n}})$ , $(\varepsilon_{A_{1}},\ldots,\varepsilon_{A_{n}})$ , and $(\varepsilon_{Y_{1}},\ldots,\varepsilon_{Y_{n}})$ are independent of each other, with entries identically distributed and all $\varepsilon_{i}\mbox{$\perp\!\!\!\perp$}\varepsilon_{j}$ provided $\{i,j\}\not\subseteq F_{k},\forall\,\,k\in 1,\ldots,n$ ; that is, errors between units are independent provided that the units are neither directly connected nor share ties with a common node in the interference network represented by $\mathbf{F}$ .

Interference bias arises when the data arise from the SCM (4) but investigators wrongly assume that $f_{Y}$ is a function only of $A_{i}$ and $L_{i}$ , and not of $\{A_{j}\colon j\in F_{i}\}$ or $\{L_{j}\colon j\in F_{i}\}$ . Since interference violates the consistency rule (Pearl2010), commonly relied upon for identification of causal effects, ignoring its presence, even inadvertently, risks the drawing of unsound inferences. Under the SCM (4), identifiability of the causal effect of applying an exposure to all units $A_{j}\colon j\in F_{i}$ can be restored by controlling for all $L_{j}\colon j\in F_{i}$ directly or via dimension-reducing summaries (vanderlaanCausalInferencePopulation2014) of a unit’s friends’ confounders and exposures.
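To illustrate how ignoring interference can distort an effect, the following sketch simulates one draw from an SCM of the form (4) and compares two quantities under an additive shift of every unit's exposure. All specifics are assumptions made for illustration and are not taken from the text: the ring network, the summary "own value plus the sum over friends", the linear structural equations and their coefficients, and fully independent errors (a special case of the error conditions above).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
idx = np.arange(n)

# Assumed network: a ring in which each unit's friends are its two neighbors.
F = np.zeros((n, n))
F[idx, (idx + 1) % n] = 1
F[idx, (idx - 1) % n] = 1

def summarize(x):
    """Assumed summary: a unit's own value plus the sum over its friends."""
    return x + F @ x

def f_Y(a_s, l_s, eps):
    """Hypothetical linear outcome equation; coefficients are arbitrary."""
    return 2.0 * a_s + 0.3 * l_s + eps

# Generate L, then A, then Y, following the ordering of the SCM.
L = rng.normal(size=n)
L_s = summarize(L)
A = 0.5 * L_s + rng.normal(size=n)
A_s = summarize(A)
eps_Y = rng.normal(size=n)
Y = f_Y(A_s, L_s, eps_Y)

delta = 0.1
# Shift every unit's exposure, then summarize: friends' shifts are propagated.
Y_full = f_Y(summarize(A + delta), L_s, eps_Y)
# Propagate only each unit's own shift, ignoring the shifts of its friends.
Y_own_only = f_Y(A_s + delta, L_s, eps_Y)

effect_full = np.mean(Y_full) - np.mean(Y)
effect_own_only = np.mean(Y_own_only) - np.mean(Y)
```

In this toy model each summary aggregates three exposures, so the change in the mean outcome when friends' shifts are propagated is three times the change obtained when only a unit's own shift is counted.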

3.1 Induced Modified Treatment Policies

vanderlaanCausalInferencePopulation2014, aronowEstimatingAverageCausal2017 and ogburnCausalInferenceSocial2022 note that identifiability can be restored despite the presence of interference by basing inference on adjusting for $A^{s}$ instead of $A$ and $L^{s}$ instead of $L$ . This is, however, incompatible with the application of MTPs: we are forced to no longer consider the causal effect of $A$ under $d(\cdot;\delta)$ , but rather, the causal effect of $A^{s}$ after intervening on the upstream exposure via $d(\cdot;\delta)$ . To identify the causal effects of MTPs under interference, we now introduce a novel intervention scheme—the induced MTP.

Consider applying an MTP to the SCM (4), replacing the exposure of each unit $A_{i}$ with $A_{i}^{\star}=d(A_{i},L_{i};\delta)$ . Under interference, we are interested in the causal effect of $A^{s}$ on $Y$ . Hence, the scientific question of interest is actually, “what if $A_{i}^{s}$ were replaced by some $A_{i}^{s,\star}=h(A_{i}^{s},L_{i}^{s})$ ?”, where $h$ maps the natural value of $A_{i}^{s}$ to the value that would have resulted had $d(\cdot;\delta)$ first been applied to $A_{i}$ and only then summarized. This process is illustrated in Figure 1. We denote by $h$ the induced MTP.

Figure 1: How an induced MTP $h$ arises from mapping a natural summary measure $A^{s}$ to the same summary after $A$ is modified by the MTP $d(A_{i},L_{i}^{s};\delta)$ .

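From Figure 1, we can see that the induced MTP $h$ must satisfy $(h\circ s_{A})(\mathbf{A},\mathbf{L};\delta)=(s_{A}\circ d)(\mathbf{A},\mathbf{L};\delta)$ , where $s_{A}$ is the summary function that produces $A^{s}$ and $\circ$ denotes the function composition operator; that is, summarizing the natural exposures and then applying $h$ must give the same result as applying $d$ to each exposure and then summarizing. The counterfactual mean of an induced MTP is given by Equation (5):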

\begin{equation}
\Psi_{n}(\mathsf{P})=\mathbb{E}_{\mathsf{P}}\Big[\frac{1}{n}\sum_{i=1}^{n}Y(h(a_{i}^{s},l_{i}^{s};\delta))\Big]\ . \tag{5}
\end{equation}

This data-adaptive parameter will converge to the population counterfactual mean as $n\rightarrow\infty$ (Hubbard2016; ogburnCausalInferenceSocial2022). Use of such a parameter definition is necessary because we must condition on the single observation of the interference network at play. Under an induced MTP, interference no longer hampers identifiability because $A_{i}^{s}$ captures the contribution of all relevant units (i.e., a given unit $i$ and its friends) to each $Y_{i}$ . A population intervention effect is defined by subtracting $\mathbb{E}Y$ (munozPopulationInterventionCausal2012).
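To make the construction of $h$ concrete, consider an additive shift combined with a simple summary. The sketch below is only an illustration under stated assumptions: the summary `s_A` (own exposure plus the sum of friends' exposures), the four-unit network, and the function names are hypothetical and are not prescribed by the text. Under these choices the induced MTP has a closed form, and the code checks the defining property that summarizing and then applying $h$ agrees with applying $d$ and then summarizing.

```python
import numpy as np

def s_A(a, F):
    """Assumed summary: own exposure plus the sum of friends' exposures."""
    return a + F @ a

def d(a, delta):
    """Additive shift MTP applied to each unit's own exposure."""
    return a + delta

def h(a_s, F, delta):
    """Induced MTP for this particular (d, s_A) pair.

    Each unit and each of its friends is shifted by delta, so the summary
    moves by delta * (1 + number of friends).
    """
    return a_s + delta * (1 + F.sum(axis=1))

# Small illustrative network: unit 0 is tied to units 1 and 2; unit 3 is isolated.
F = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 0]])
a = np.array([1.0, 2.0, 3.0, 4.0])
delta = 0.5

via_h = h(s_A(a, F), F, delta)   # summarize, then apply the induced MTP
via_d = s_A(d(a, delta), F)      # apply the MTP, then summarize
```

Note that the size of the induced shift differs across units with the number of friends, even though the original MTP shifts every unit by the same $\delta$. For other summaries or other MTPs, a closed form for $h$ may be less immediate.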

3.2 Identification

Let $\mathcal{A}^{s}$ denote the support of $A^{s}$ , and $\mathcal{L}^{s}$ the support of $L^{s}$ . In addition to the SCM (4), identification of $\Psi_{n}(\mathsf{P})$ by a statistical parameter $\psi_{n}$ requires the following assumptions:

A1 (Positivity).

If $(\mathbf{a}^{s},\mathbf{l}^{s})\in\text{supp}\{\mathbf{A}^{s},\mathbf{L}^{s}\}$ , then $(h(\mathbf{a}^{s},\mathbf{l}^{s};\delta),\mathbf{l}^{s})\in\text{supp}\{\mathbf{A}^{s},\mathbf{L}^{s}\}$