These authors contributed equally to this work.
Bojing Liao [2] (these authors contributed equally to this work)
[1] School of Business, University of Sydney, Sydney, NSW 2000, Australia
[1] Willow Inc, Sydney, NSW 2000, Australia
[2] Institute of Creativity and Innovation, Xiamen University, No. 422 Siming South Road, Xiamen, Fujian 361000, China
Shifting the Paradigm: Estimating Heterogeneous Treatment Effects in the Development of Walkable Cities Design
Abstract
The transformation of urban environments to accommodate growing populations has profoundly impacted public health and well-being. This paper addresses the critical challenge of estimating the impact of urban design interventions on diverse populations. Traditional approaches, reliant on questionnaires and stated preference techniques, are limited by recall bias and by their difficulty in capturing the complex dynamics between environmental attributes and individual characteristics. To address these challenges, we integrate Virtual Reality (VR) with observational causal inference methods to estimate heterogeneous treatment effects, specifically employing Targeted Maximum Likelihood Estimation (TMLE) for its robustness against model misspecification. Our innovative approach leverages a VR-based experiment to collect data that reflect perceptual and experiential factors. The results show the heterogeneous impacts of urban design elements on public health and underscore the necessity for personalized urban design interventions. This study not only extends the application of TMLE to built environment research but also informs public health policy by illuminating the nuanced effects of urban design on mental well-being and advocating for tailored strategies that foster equitable, health-promoting urban spaces.
keywords:
Causal inference, Informatics, Urban design, VR-based Experiment, Epidemiology
1 Introduction
Urbanization has dramatically transformed environments and lifestyles, becoming a focal point for accommodating a growing population [1, 2]. Since the 20th century, scholars have sought to understand the intricate relationship between individual well-being and the socio-cultural environment [1]. Recent focus on urban design’s role in addressing global public health issues marks a resurgence of interest in this field [2, 3]. Research underscores the impact of the built environment on health, mental wellness, and quality of life, noting that neighborhood designs promoting physical activity, green space, and aesthetics are crucial for combating obesity, heart disease, and mental health issues [4, 5, 6, 7, 8]. Specifically, the benefits of walkable neighborhoods for mental health and social well-being are well documented, underscoring the importance of creating environments that support health and sustainable communities [9, 10, 3, 11].
However, estimating the potential effects of urban (built) environment interventions on diverse populations remains a complex challenge. Traditionally, researchers have utilized a variety of methodological approaches to explain the causal impact of the built environment on mental health outcomes. This broad spectrum of methodologies reflects the nuanced and complex nature of how environmental contexts influence subjective well-being. Among these approaches, questionnaires and surveys have been extensively employed to evaluate individuals’ perceptions, experiences, and self-reported mental health status, particularly in relation to the urban environments within their neighborhood. Such data is invaluable for assessing how personal interpretations of environmental factors correlate with mental health outcomes [12, 13].
Although these methods are instrumental in gathering large sets of subjective data, they are not without limitations. A significant concern associated with questionnaires and surveys is their susceptibility to recall bias [14]. Recall bias occurs because traditional tools depend heavily on the respondents’ ability to accurately remember and report past events or experiences. This type of bias can particularly skew the data when individuals do not accurately remember, or alter their recollections (intentionally or unintentionally) based on current feelings or misconceptions [15]. For example, a recent study by Li et al. (2022) highlights how recall bias can affect research findings, particularly in studies that rely on self-reports: respondents may misreport stressful incidents or overemphasize certain events depending on their current mood or recent experiences, thus distorting the true impact of the built environment on their psychological state [16].
To overcome the limitations inherent in traditional survey methods, conjoint experiments and the stated preference method have been employed to elicit individuals’ preferences and hypothetical choices concerning urban built environment scenarios with varying levels of attributes [17]. By presenting respondents with hypothetical scenarios featuring different combinations of environmental characteristics, these methods can provide insights into the relative importance of various factors in decision-making processes concerning urban environments. For example, in a study by Zhao et al. (2022), participants were presented with a series of hypothetical residential scenarios differing in proximity to green spaces and types of amenities available [18]. The researchers used a conjoint analysis approach to quantify how each attribute influenced the participants’ preferences for one living scenario over another, revealing a strong preference for proximity to green spaces over other factors.
Nevertheless, while conjoint experiments and stated preference methods offer valuable insights, they also have notable limitations. One such limitation is their inability to fully account for the intricate interplay between environmental factors, such as the presence of green spaces and land use mix-diversity, and individual characteristics like socioeconomic status and physical health. Furthermore, these methods often do not capture actual behavioral responses in real-world settings, potentially leading to discrepancies between stated and actual preferences [18]. For instance, Hurtubia et al. (2021) used stated preference methods to explore preferences for bike sharing stations in perceptions of public spaces [19]. While the results indicated a positive preference for the presence of bike sharing, subsequent observational studies in the same urban areas showed a negative preference for disorganized dock-less bikes on sidewalks. This suggests that actual behavior may diverge from hypothetical choices due to factors not captured in the initial surveys, such as actual accessibility, perceived safety, and personal time constraints [14, 20].
To address these challenges, we propose a novel approach that integrates Virtual Reality (VR) technology with observational causal inference methods to estimate heterogeneous treatment effects of urban built environment interventions on mental health outcomes. The primary objective of causal inference is to quantify the impact of a specific intervention. This concept is grounded in the counterfactual framework initiated by Rubin [21]. A notable development in this field is the Targeted Maximum Likelihood Estimation (TMLE) [22], introduced by Van der Laan and Rubin in 2006. TMLE stands out from traditional methods by utilizing both the treatment-generating mechanism (propensity score model) and the outcome-generating mechanism to estimate the average treatment effect, making it doubly robust. This means it remains reliable even if either the exposure or outcome model is incorrectly specified, and it has been proven more effective than other methods like inverse probability of treatment weighting and propensity score matching, especially in cases of likely model misspecification.
Despite its growing recognition, the application of TMLE for Likert outcomes is scarce. Likert outcomes are common in sociology but are often simplified to latent groups for ease of analysis, which can lead to loss of information and variable statistical power. Understanding TMLE in the context of Likert outcomes is crucial due to these complexities. Most existing TMLE guides focus on simulated data, but this article aims to demonstrate TMLE using a real-world dataset, helping practitioners understand its application in practical scenarios and highlighting its differences from other commonly used methods. To bridge the gap between theory and practice, we present a case study that illustrates the nuances of applying TMLE to a real-world situation. In doing so, we provide a VR-based conjoint experiment example of how TMLE can be used to interpret Likert data within the rich, multifaceted domain of social research.
The VR-based conjoint experiment has changed the way researchers investigate individual preferences and behavioral responses regarding urban environments [14, 18, 23]. This methodology utilizes realistic three-dimensional simulations of neighborhoods, which engage participants in a highly immersive and interactive manner. By doing so, VR-based conjoint experiments provide a controlled and replicable platform for data collection, which is particularly adept at capturing the nuanced influence of perceptual and experiential factors on human behavior [24]. Traditional survey methods often fail to capture the full sensory and emotional responses that individuals might have to real-world environments [24, 14, 20, 18]. VR overcomes this limitation by providing a rich, multi-sensory experience that can include visual and auditory stimuli, thereby mimicking real-life experiences more closely [20]. This allows researchers to observe how these complex perceptual inputs influence decision-making and preference formation in a way that abstract surveys cannot [24, 20].
The VR-based conjoint experimental design is particularly suited to the application of TMLE, as it generates the type of complex, multi-layered data that TMLE was developed to analyze. By leveraging the TMLE framework with a joint treatment approach (e.g., accounting for correlations between environmental factors), our methodology enables the estimation of individualized treatment effects while accounting for the complex interplay between multiple environmental factors and individual characteristics [25]. This approach not only enhances the accuracy of the resulting data analysis but also offers a more nuanced understanding of the effects of urban design interventions.
This study reveals meaningful insights, establishing that interventions such as land use mix (LM), open spaces (OS), block connectivity (BC), Road Size (RS), and green/trees (GT) exert a positive effect on Conditional Average Treatment Effect (CATE) values, with OS interventions exhibiting notable efficacy as standalone measures. The research highlights the complexity inherent to urban walkability and calls for a judicious interpretation of CATE values in light of possible confounding due to demographic variables. The evidence points to the conclusion that deliberate urban design improvements, especially those integrating green and open spaces, are instrumental in enhancing walkability. While LM and BC interventions lead to moderate gains, the impact of Road Size (RS) interventions invites reconsideration, as they may detract from walkability if applied without complementary measures. These pivotal findings advocate for a strategic urban design methodology that leverages intervention benefits to foster environments conducive to pedestrians.
Our study contributes significantly to both theoretical advancements and practical applications in this field. First, we expand the application of TMLE and joint treatment estimation to the innovative context of urban built environment interventions. This advancement enhances the field of causal inference, deepening our understanding of the nuanced relationships between the urban built environment and mental health outcomes. Second, our findings provide invaluable insights for urban design and public health policies. By facilitating the development of tailored interventions that cater to diverse population segments, our study optimizes resource allocation and maximizes positive impacts on mental well-being. Finally, our work lays the groundwork for creating equitable and sustainable urban built environments that promote holistic well-being for all. It challenges one-size-fits-all approaches and underscores the importance of personalized strategies.
2 Methods
2.1 Case Study Design
In our study, we employed a virtual reality (VR)-based conjoint experiment to assess individuals’ preferences and behaviors in response to urban design interventions. This approach features three stages: specifying attributes and their levels, designing 3D neighborhood simulations, and developing an online questionnaire to capture perceptual and experiential responses within a VR environment.
The choice of attributes and levels for the VR-based conjoint experiment design was based on earlier work [14]. The experiment focuses on the street block level, chosen because its influence on walking behavior is analogous to that of a neighborhood, as supported by the literature [26]. This scale allows for detailed 3D modeling and enables participants to closely engage with the urban environment’s features.
Attributes impacting walking behavior—land use mix-diversity, walking facilities, sidewalks, and trees—were identified based on prior research, along with connectivity and open space[14, 27]. Five attributes were selected for the street block experiment, each with two levels: land use mix (exclusively residential or mixed-use), block connectivity (high or low), road size (two lanes with narrow pedestrian zones or one lane with wide pedestrian zones), open space (presence or absence), and green/trees (presence or absence of trees).
From the potential 32 ($2^5$) attribute combinations, we applied a fractional factorial design to reduce the number while preserving orthogonality, ensuring statistical independence among attributes. This approach resulted in eight distinct attribute profiles, facilitating efficient estimation of main effects without attribute correlation [28]. This design strategy enhances our experimental setup’s capacity to discern the individual contributions of various urban design elements to walking behavior, streamlining the process of identifying effective interventions for promoting walkability.
Attributes | Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Set 6 | Set 7 | Set 8 |
---|---|---|---|---|---|---|---|---|
Land use mix | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
Block connectivity | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
Road size | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
Open space | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
Green/Trees | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
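The orthogonality claimed for the eight profiles can be checked directly from the table. The following is a minimal sketch using only the Python standard library; the dictionary keys and helper names are illustrative and not taken from the study's materials. It recodes each 0/1 level as a -1/+1 contrast and confirms that every attribute is level-balanced and that every pair of attribute columns has a zero cross-product.

```python
from itertools import combinations

# Attribute levels for the eight profiles (columns Set 1..Set 8 of the table).
design = {
    "land_use_mix":       [1, 1, 1, 1, 0, 0, 0, 0],
    "block_connectivity": [1, 1, 0, 0, 1, 1, 0, 0],
    "road_size":          [1, 1, 0, 0, 0, 0, 1, 1],
    "open_space":         [1, 0, 1, 0, 1, 0, 1, 0],
    "green_trees":        [1, 0, 0, 1, 0, 1, 1, 0],
}


def to_contrast(levels):
    """Recode 0/1 levels as -1/+1 contrasts."""
    return [2 * v - 1 for v in levels]


def is_balanced(levels):
    """True when both levels appear equally often across the profiles."""
    return 2 * sum(levels) == len(levels)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


balanced = {name: is_balanced(col) for name, col in design.items()}
cross_products = {
    (a, b): dot(to_contrast(design[a]), to_contrast(design[b]))
    for a, b in combinations(design, 2)
}
orthogonal = all(value == 0 for value in cross_products.values())

print("all attributes balanced:", all(balanced.values()))
print("all pairs orthogonal:", orthogonal)
```

A zero cross-product for each of the ten attribute pairs is what allows main effects to be estimated without correlation among attributes in a design of this kind.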
In the subsequent phase, the eight attribute combinations were translated into corresponding virtual reality (VR) environments. To accomplish this, we constructed the foundational 3D model to represent the Dutch street block by using SketchUp Pro. The dimensions of the experiment area within the street block spanned 300 meters in length and 240 meters in width. While maintaining a consistent road width, we introduced variations in road type: (1) two lanes dedicated to cars with a narrow pedestrian sidewalk, and (2) a single lane for cars with a wider pedestrian pathway.
Regarding the land use mix attribute, we established two levels: the first level entailed an exclusively residential street block, while the second level incorporated a mix of residential and commercial areas such as shops and supermarkets dispersed within the residential area. In terms of the block connectivity attribute, we manipulated the number of intersection points within the street block. Additionally, we introduced variations in the presence or absence of open space, as well as the inclusion or exclusion of street trees to represent the green attribute. By utilizing the 3D sketch models and altering attribute levels accordingly, we generated a total of eight distinct VR environments.
Subsequently, all eight VR environments were imported into Twinmotion, a rapid 3D rendering software[29]. Within Twinmotion, we integrated materials, trees, traffic elements, facilities, and human figures into each of the 3D sketch models, enhancing their realism and creating more interactive experiences. Then, we established a consistent walking perspective and exported all virtual reality environments as video files. To ensure coherence, every video depicting the virtual reality environments adhered to identical parameters, including the walking route, viewing direction, geographical location, duration of sunlight, seasonal conditions, and weather conditions. Each video had a standardized duration of 90 seconds.
Furthermore, to maintain uniformity in the conveyed information, all scenarios incorporated identical 3D objects, such as buildings, trees, and facilities, to represent the respective characteristics. Consequently, participants were presented with consistent attributes throughout all scenarios, including uniform tree colors, building styles, and facility materials. This approach ensured that participants evaluated the environments based on the attributes utilized in their construction, while keeping all other factors constant.
The questionnaire employed in this study is divided into two distinct sections. The first part focuses on individuals’ perceptions of their existing neighborhood and personal characteristics, while the second part encompasses the virtual reality (VR) environment, comprising videos and related questions intended to elicit participants’ perceptions of the VR environments. Within the VR environments section of the questionnaire, participants are prompted to evaluate their experiences of the virtual environments while viewing the corresponding videos. They are asked to provide ratings based on the emotions evoked by each video. To manage the length of the questionnaire, four out of the eight dynamic 3D videos showcasing the VR environments are randomly presented to each respondent. The participant’s perception of each virtual reality environment is assessed through two sections of questions.
The first section encompasses two inquiries pertaining to the environment’s quality, specifically: (1) “How satisfied are you with the overall quality of this virtual environment?”; and (2) “How satisfied are you with the walking friendliness of this virtual environment?” Participants are asked to rate their satisfaction using a 7-point Likert scale, ranging from “not at all satisfied” to “fully satisfied.” To capture the participants’ feelings during the virtual walk-through experience, the preference rating method introduced by Birenboim et al. (2019) is employed [20]. The second section of the questionnaire focuses on the emotional responses evoked by the virtual environment. Four dimensions of emotions associated with perceived walkability during the virtual walk-through experience are examined: happiness, comfort, annoyance, and security. Participants are requested to indicate the extent to which they experienced each of these emotions. The questions are presented as statements, such as “I felt happy/comfortable/annoyed/secure.” For each item, respondents provide their responses on a 7-point Likert scale, ranging from “completely disagree” (1) to “completely agree” (7), as illustrated in Table 5. Additionally, the second section includes inquiries concerning the perceived benefits derived from the virtual environment.
The data for this study were sourced from earlier work by Liao et al. (2022) [14]. Participants were recruited from a nationally representative consumer panel in the Netherlands, as well as through popular social media platforms such as Twitter (X), LinkedIn, and Facebook. Upon introducing the virtual reality (VR) environments, respondents were informed that they would be presented with scenarios depicting a typical Dutch street block in a virtual setting. Subsequently, participants were requested to evaluate the overall quality and walking friendliness of the virtual scenarios, as well as provide their subjective emotional responses while viewing these scenarios.
A total of 308 individuals completed the online questionnaire, with 272 respondents sourced from the consumer panel and an additional 36 participants recruited through social media channels. To ensure the reliability and robustness of the data, respondents who provided repetitive answers to each question or who completed the VR portion of the questionnaire in less than 8 minutes were excluded from the analysis. Following the data cleaning process, a final sample of 295 respondents remained for further analysis. All participants were exposed to four 3D-videos, resulting in a total of 1,180 ratings recorded for each item of interest. Consequently, the final dataset consists of observations for each respondent pertaining to 4 virtual walking trips.
2.2 Conditional Average Treatment Effect Estimation
Rubin’s potential outcomes framework [21] suggests that in studies with a binary treatment, we can only observe the actual outcomes for each subject under the specific treatment they received. To understand a treatment’s effect, we must compare what actually happened to a subject with the treatment to what would have happened without it or with a different treatment. This hypothetical scenario is called the counterfactual outcome. The challenge, especially in observational studies, is accurately estimating these counterfactual outcomes. This difficulty arises because the characteristics of the groups receiving different treatments can vary significantly, leading to biased comparisons. In simpler terms, certain traits may make some people more likely to receive a specific treatment, complicating the task of isolating the treatment’s true effect.
We formalize our problem using a sample dataset of $n$ independent and identically distributed examples $(X_i, Y_i, W_i)$, $i = 1, \dots, n$, where $X_i$ represents individual features, $Y_i$ is the observed outcome, and $W_i$ stands for treatment options and their assignment status. We posit the existence of potential outcomes $Y_i(1)$ and $Y_i(0)$ corresponding to the outcome we would have observed given the treatment assignment $W_i = 1$ or $W_i = 0$ respectively, such that $Y_i = Y_i(W_i)$. One common quantity of interest is the average treatment effect for treatment $W$, which is $\tau = E[Y(1) - Y(0)]$. Here, in contrast, we want to understand how treatment effects vary with the observed covariates $X$, and consider the conditional average treatment effect (CATE):

$$\tau(x) = E[Y(1) - Y(0) \mid X = x] \tag{1}$$

When the effect of multiple joint treatments is of interest, we can extend the definition of CATE to a treatment vector $W = (W^1, \dots, W^K)$:

$$\begin{align} \tau(x) &= E[Y(1, \dots, 1) - Y(0, \dots, 0) \mid X = x] \tag{2} \\ &= E[Y(1, \dots, 1) \mid X = x] - E[Y(0, \dots, 0) \mid X = x] \tag{3} \end{align}$$

When the effect of a subset of joint treatments is of interest, we can define CATE as:

$$\tau_{j,k}(x) = E[Y(W^j = 1, W^k = 1, W^{-(j,k)}) - Y(W^j = 0, W^k = 0, W^{-(j,k)}) \mid X = x]$$

where $W^j$ and $W^k$ are the treatment options of interest, and $W^{-(j,k)}$ stands for the observed treatment assignment for all treatments other than $W^j$ and $W^k$. In our simulation and case studies, up to five treatment options ($K = 5$) are possible.
2.2.1 Assumptions
In applying Rubin’s causal framework to estimate counterfactuals and label the treatment effect as causal, several critical assumptions must be made:
1. Conditional exchangeability: This means the likelihood of receiving the treatment is based solely on observed covariates. Essentially, it assumes that any unmeasured factors influencing the outcome are equally distributed between the treated and untreated groups, provided that measured confounders are accounted for.
2. Positivity: For every set of covariate values, there must be a nonzero chance of receiving each treatment condition (both treated and untreated). It’s important that the outcome for an individual is independent of the treatment status of others.
3. Consistency: The treatment should be clearly defined. This clarity is necessary to ensure that the level of exposure doesn’t vary across subjects. Under this assumption, if the treatment is well-defined and consistent, we can infer that a subject’s observed outcome would match their counterfactual outcome for their given exposure history.
2.3 Estimation methods
G-formula, propensity score matching and targeted maximum likelihood estimation (TMLE) are popular statistical methods used in epidemiology and other fields for causal inference and adjusting for confounding in observational studies. The following subsections briefly introduce each method and its extension to joint treatments and Likert scale outcomes.
2.3.1 G-formula and modeling of Likert scale outcomes
The g-formula is an analytical tool for estimating standardized outcome distributions by using specific outcome distribution estimates based on covariates, which include both exposures and confounders [30]. This formula can be applied to calculate common measures of association, such as the conditional risk differences (as in Equation 1). In our study, we evaluate the overall Likert outcomes observed in our cohort and contrast them with the anticipated Likert outcomes in the same cohort if a new treatment had been administered.
Equation 1 is a standard application of the g-formula, where we use regressions to estimate the potential Likert outcomes under treatment and control conditions (i.e., $E[Y(1) \mid X]$ and $E[Y(0) \mid X]$), respectively. To construct the regression, we first standardize the individual Likert outcomes, which range from 1 (Strongly Disagree) to 7 (Strongly Agree), to a scale from 0 to 1 by performing the min-max transformation:

$$Y^{*} = \frac{Y - \min(Y)}{\max(Y) - \min(Y)}$$

where $\min(Y)$ and $\max(Y)$ are the sample minimum and maximum outcomes.
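Hence, this can be seen as an extension of logistic regression [31], where typically only two categories, such as ‘positive’ and ‘negative’, are considered. However, our scenario includes more than two ordered levels. Similar to binary regression, we develop a real-valued predictor function $f(X)$ (for instance, a linear function in the case of linear binary regression), aiming to minimize a certain loss function, denoted as $\ell(f(X), Y^{*})$, against the target labels, where $Y^{*} \in [0, 1]$ is the min-max scaled outcome. One popular loss function is the logistic loss:

$$\ell(f(X), Y^{*}) = -\Bigl[ Y^{*} \log \sigma(f(X)) + (1 - Y^{*}) \log\bigl(1 - \sigma(f(X))\bigr) \Bigr], \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \tag{4}$$

which measures the distance from the classification margin. In this context, we obtain the estimated potential outcome by converting the estimated outcome on the probabilistic scale, $\hat{Y}^{*} = \sigma(f(X))$, back to the Likert scale by applying the following formula:

$$\hat{Y} = \hat{Y}^{*} \bigl(\max(Y) - \min(Y)\bigr) + \min(Y)$$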
G-computation relies heavily on the correct specification of the outcome model. If the model does not accurately capture the relationship between the covariates, treatment, and outcome, the estimates produced may be biased. Selection bias can occur if certain key variables that influence both the selection into the study population and the outcome are omitted or incorrectly modeled.
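The standardization behind the g-formula can be illustrated on a small hypothetical dataset. The sketch below makes two simplifying assumptions that are not part of the study: the single covariate is binary, and stratum-specific means stand in for the regression outcome model. The function names and the toy records are invented for illustration. The outcome is min-max scaled, the treated-minus-control contrast is computed within each covariate stratum, the contrasts are averaged over the covariate distribution, and the result is mapped back to the Likert scale.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the sample minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def g_formula_effect(x, w, y):
    """Standardize stratum-specific mean differences over the distribution of x."""
    y_scaled, lo, hi = min_max_scale(y)
    n = len(y)
    effect_scaled = 0.0
    for stratum in sorted(set(x)):
        members = [i for i in range(n) if x[i] == stratum]
        treated = [y_scaled[i] for i in members if w[i] == 1]
        control = [y_scaled[i] for i in members if w[i] == 0]
        if not treated or not control:
            raise ValueError("positivity violated in stratum %r" % (stratum,))
        contrast = sum(treated) / len(treated) - sum(control) / len(control)
        effect_scaled += contrast * len(members) / n
    # A difference of scaled outcomes maps back by the range only (the minimum cancels).
    return effect_scaled * (hi - lo)


# Hypothetical toy records: covariate x, treatment w, 7-point Likert outcome y.
x = [0, 0, 0, 0, 1, 1, 1, 1]
w = [1, 1, 0, 0, 1, 0, 0, 0]
y = [5, 7, 3, 5, 6, 4, 2, 3]

treated_mean = sum(v for v, t in zip(y, w) if t == 1) / sum(w)
control_mean = sum(v for v, t in zip(y, w) if t == 0) / (len(w) - sum(w))
naive = treated_mean - control_mean
standardized = g_formula_effect(x, w, y)
print("naive difference:", round(naive, 3))
print("g-formula estimate:", round(standardized, 3))
```

In this toy example the unadjusted difference and the standardized estimate disagree because treatment is more common in one stratum than the other, which is the kind of imbalance the outcome model is meant to adjust for.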
2.3.2 Propensity Score Matching
Propensity score matching is a method to reduce selection bias in the estimation of causal effects. It involves matching treated and control units with similar propensity scores. It is intuitive and visually demonstrable, making it easier to communicate to non-statisticians. This method helps to create a balanced dataset by matching treated and untreated subjects with similar characteristics.
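The propensity score is the probability of treatment assignment conditional on observed covariates, typically estimated using logistic regression:

$$e(X) = P(W = 1 \mid X) \tag{5}$$

where $e(X)$ is the propensity score, $W$ is the binary treatment indicator, and $X$ denotes the observed covariates.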
After estimating the propensity scores, subjects are matched based on the Euclidean distance between those scores to form comparable groups. One limitation of propensity score matching is that it can only balance observed covariates, leaving the possibility of bias due to unobserved confounders. Also, it can lead to a loss of data, as unmatched subjects are typically discarded.
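The matching step can be sketched as follows, assuming the propensity scores have already been estimated. This is an illustrative one-to-one nearest-neighbor matcher with replacement, not the procedure used in the study; the `caliper` parameter, the function name, and the toy scores and outcomes are all hypothetical. For scalar scores, the Euclidean distance reduces to an absolute difference.

```python
def match_on_propensity(scores, treated, outcomes, caliper=0.1):
    """Match each treated unit to its nearest control on the propensity score.

    Returns the mean treated-minus-control difference over matched pairs and
    the list of (treated_index, control_index) pairs. Controls may be reused.
    """
    controls = [i for i, t in enumerate(treated) if t == 0]
    if not controls:
        raise ValueError("no control units available for matching")
    pairs = []
    for i, t in enumerate(treated):
        if t != 1:
            continue
        nearest = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        # Discard matches whose score distance exceeds the caliper.
        if abs(scores[nearest] - scores[i]) <= caliper:
            pairs.append((i, nearest))
    if not pairs:
        raise ValueError("no treated unit found a match within the caliper")
    effect = sum(outcomes[i] - outcomes[j] for i, j in pairs) / len(pairs)
    return effect, pairs


# Hypothetical propensity scores, treatment indicators, and Likert outcomes.
scores = [0.80, 0.60, 0.30, 0.78, 0.55, 0.35, 0.10]
treated = [1, 1, 1, 0, 0, 0, 0]
outcomes = [6, 5, 4, 5, 3, 3, 2]

effect, pairs = match_on_propensity(scores, treated, outcomes)
unmatched_controls = sorted(
    set(i for i, t in enumerate(treated) if t == 0) - set(j for _, j in pairs)
)
print("matched pairs:", pairs)
print("estimated effect among the treated:", round(effect, 3))
print("unmatched controls:", unmatched_controls)
```

The last control unit has no treated counterpart with a similar score and is left out of the estimate, which illustrates the loss of data noted above.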
2.3.3 Targeted Maximum Likelihood Estimation (TMLE)
TMLE is a semi-parametric estimation method that combines ideas from g-formula and propensity score adjustment [22]. It is a doubly robust estimation method because it has a unique property where the estimator can still be consistent and asymptotically normal if either the outcome or the propensity model is correctly specified.
TMLE involves iteratively updating an initial estimate of a statistical model (such as g-formula) to improve the estimation of a causal effect or a parameter of interest. The updating step is often represented as:

$$\operatorname{logit} Q^{*}(W, X) = \operatorname{logit} Q^{0}(W, X) + \epsilon\, H(W, X), \qquad H(W, X) = \frac{I(W = 1)}{e(X)} - \frac{I(W = 0)}{1 - e(X)}$$

where $I(W = 1)$ is an indicator function that equals 1 if the individual received the treatment and 0 otherwise, and $e(X)$ is the propensity score. We update the initial outcome estimate $Q^{0}(W, X)$ to get a targeted outcome estimate $Q^{*}(W, X)$. This is done using the so-called clever covariate, denoted as $H(W, X)$, which is essentially the inverse of the propensity score for each individual. The targeting step adjusts the initial estimate by a factor $\epsilon$ that depends on the clever covariate and the discrepancy between the observed outcome and the initial estimate. The TMLE estimate of the CATE, denoted as $\hat{\tau}_{\mathrm{TMLE}}(x)$, can be computed as:

$$\hat{\tau}_{\mathrm{TMLE}}(x) = Q^{*}(1, x) - Q^{*}(0, x)$$
TMLE’s strength lies in its ability to handle complex, high-dimensional data and reduce bias in the estimation of causal effects, particularly in observational studies. It achieves this by iteratively refining the estimate to focus specifically on the parameter of interest, hence the name ”targeted”.
However, it’s important to note that double robustness does not imply that TMLE is immune to all forms of bias. For instance, if both the outcome and treatment models are misspecified, TMLE estimates can still be biased. Additionally, TMLE, like any statistical method, is still susceptible to biases from unmeasured confounding, measurement error, and other sources of bias that are not related to the specific models used.
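The targeting step can be made concrete with a small sketch in plain Python. It is an illustration under stated assumptions rather than the implementation used in the study, which fits both models with CatBoost: the covariate is binary, the initial outcome model deliberately ignores the covariate (so it is misspecified), the propensity model is the treated share within each covariate stratum, and the fluctuation parameter is found by Newton iterations on a logistic working model. All names and the toy data are hypothetical.

```python
import math


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def tmle_effect(x, w, y, newton_steps=50):
    """TMLE of the average effect of a binary treatment w on a bounded outcome y."""
    n = len(y)
    lo, hi = min(y), max(y)
    y_scaled = [(v - lo) / (hi - lo) for v in y]

    # Initial outcome model: arm means only (deliberately ignores x).
    q0 = {}
    for arm in (0, 1):
        vals = [y_scaled[i] for i in range(n) if w[i] == arm]
        q0[arm] = min(max(sum(vals) / len(vals), 1e-6), 1.0 - 1e-6)

    # Propensity model: share of treated units within each stratum of x.
    # Positivity is required: every share must lie strictly between 0 and 1.
    g = {}
    for s in set(x):
        members = [i for i in range(n) if x[i] == s]
        g[s] = sum(w[i] for i in members) / len(members)

    def clever(arm, stratum):
        return arm / g[stratum] - (1 - arm) / (1.0 - g[stratum])

    h = [clever(w[i], x[i]) for i in range(n)]
    offset = [logit(q0[w[i]]) for i in range(n)]

    # Targeting step: Newton iterations for the fluctuation parameter epsilon.
    eps = 0.0
    for _ in range(newton_steps):
        p = [expit(offset[i] + eps * h[i]) for i in range(n)]
        score = sum(h[i] * (y_scaled[i] - p[i]) for i in range(n))
        info = sum(h[i] ** 2 * p[i] * (1.0 - p[i]) for i in range(n))
        eps += score / info

    def q_star(arm, stratum):
        return expit(logit(q0[arm]) + eps * clever(arm, stratum))

    effect_scaled = sum(q_star(1, s) - q_star(0, s) for s in x) / n
    return effect_scaled * (hi - lo)


# Hypothetical toy records: covariate x, treatment w, 7-point Likert outcome y.
x = [0, 0, 0, 0, 1, 1, 1, 1]
w = [1, 1, 0, 0, 1, 0, 0, 0]
y = [5, 7, 3, 5, 6, 4, 2, 3]
tmle_estimate = tmle_effect(x, w, y)
print("TMLE estimate:", round(tmle_estimate, 4))
```

On this toy data the initial, covariate-blind outcome model implies an effect of 2.6 Likert points, whereas the targeted estimate is 2.5, the value obtained by averaging the within-stratum contrasts. Because the propensity model here is correct, the targeting step repairs the misspecified outcome model, which is the double robustness property in miniature.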
2.3.4 CatBoost algorithm
In our study, we employed the CatBoost algorithm [32] for fitting both the outcome model and the propensity models. CatBoost stands out in survey analysis due to its exceptional handling of categorical data, a common feature in surveys, without needing extensive preprocessing. It’s highly robust against overfitting, crucial for complex datasets, and efficiently manages large datasets, making it ideal for extensive survey data. Additionally, its advanced treatment of missing values is particularly beneficial in survey contexts where such issues are prevalent. CatBoost also offers high accuracy in predictive modeling with features like ordered boosting, and despite its complexity, it provides interpretability through feature importance scores. This combination of efficiency, accuracy, and ease of integration with existing data analysis pipelines makes CatBoost a powerful and versatile tool in both survey analysis and regression tasks.
2.4 Estimation performance using simulated dataset
To benchmark the algorithms, we simulate a dataset containing 5 covariates and 5 treatments, with the outcome variable being a Likert scale ranging from 1 to 7. The data simulation process is designed to explore the impact of these treatments and covariates on the outcome variable, providing insights into potential causal relationships.
The dataset consists of 5000 samples, each with 5 covariates generated from a normal distribution. The treatment effects are uniformly set to 1 for all five treatments, indicating an equal impact of each treatment on the outcome variable. The treatment assignments are influenced by both the covariates and a confounding parameter, with a baseline probability adjusted based on the sum of the first three covariates and a confounding ratio.
For each sample, every covariate is drawn from this normal distribution; we write $X_j$ for the $j$-th covariate of a sample.
The probability of receiving each treatment is calculated by passing a combination of the covariates and a confounding factor (confound) through the logistic function. The components are:
- the covariate sum, with only the first three covariates contributing to the unbalanced confounding,
- a small coefficient applied to the covariates,
- the confound ratio, set to three values that indicate scenarios with low, medium and high levels of confounding,
- the logistic function, and
- the treatment assignment for the $j$-th treatment, drawn according to the resulting probability.
The response variable is generated from the baseline response, the effect of the treatments, the impact of the covariates, and random noise, where:
- each treatment contributes its own effect term,
- each covariate contributes its own term,
- a noise term is added, and
- the clip function ensures that the response remains within the Likert scale range of 1 to 7.
We generate 50 simulated datasets under each confounding scenario and compare the estimated conditional average treatment effect with the true treatment effect, which can be calculated from the counterfactuals as explained in previous sections. The benchmark is reported as the percentage error of the estimate relative to the true effect.
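The data-generating process described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used in the study: the standard normal covariates, the values of `alpha`, `confound`, `baseline`, `beta` and `noise_sd`, the use of one shared assignment probability for all five treatments, and the absolute relative form of `percentage_error` are all assumptions introduced here.

```python
import numpy as np

def simulate(n=5000, n_cov=5, n_treat=5, alpha=0.1, confound=1.0,
             effect=1.0, baseline=1.5, beta=0.2, noise_sd=0.5, seed=0):
    """Return covariates X, treatments T, outcome Y and the true effect per treatment."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 1.0, size=(n, n_cov))   # assumption: standard normal covariates
    # Small coefficient on all covariates plus a confounding term that
    # depends on the first three covariates only.
    lin = alpha * X.sum(axis=1) + confound * X[:, :3].sum(axis=1)
    p = 1.0 / (1.0 + np.exp(-lin))              # logistic function
    T = rng.binomial(1, p[:, None], size=(n, n_treat))
    eps = rng.normal(0.0, noise_sd, size=n)
    base = baseline + beta * X.sum(axis=1) + eps

    def outcome(t):
        # Clip the response to the Likert range of 1 to 7.
        return np.clip(base + effect * t.sum(axis=1), 1, 7)

    Y = outcome(T)
    # True effect of each treatment from its two counterfactual outcomes,
    # holding the other treatments at their observed values.
    true = np.empty(n_treat)
    for j in range(n_treat):
        t1, t0 = T.copy(), T.copy()
        t1[:, j], t0[:, j] = 1, 0
        true[j] = np.mean(outcome(t1) - outcome(t0))
    return X, T, Y, true

def percentage_error(estimate, truth):
    """Assumed benchmark: absolute error relative to the true effect, in percent."""
    return 100.0 * abs(estimate - truth) / abs(truth)

X, T, Y, true = simulate(confound=1.0)
```

Because the response is clipped, the true effect computed from the counterfactuals can be smaller than the nominal effect of 1, which is why the sketch derives it from the simulated potential outcomes rather than reading it off the parameter.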
2.4.1 Simulation Result
Figure 1 illustrates a scenario of unbalanced confounding across five covariates from one random set of our simulations under medium confounding. In the unadjusted state, represented by blue circles, the absolute standardized mean differences for covariates 1, 2 and 3 indicate a substantial imbalance, suggesting that the treatment groups are systematically different regarding these covariates.

The application of inverse probability of treatment weighting (IPTW), depicted by purple triangles, aims to adjust for these differences. The effectiveness of this method can be judged by how close the adjusted values lie to the baseline, which corresponds to a mean difference of zero. IPTW adjustment reduces the imbalance in the covariates with systematic confounding patterns while leaving the other covariates (4 and 5) unadjusted, as evidenced by adjusted values that remain distant from the centerline.
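The balance diagnostic discussed here can be illustrated with a short sketch. It computes the absolute standardized mean difference of each covariate before and after IPTW on a toy dataset with one confounded and one unconfounded covariate. The toy data, the use of the true propensity score as the weight source, and the pooled unweighted standard deviation in the denominator are assumptions of this example, not details taken from the study.

```python
import numpy as np

def asmd(x, t, w=None):
    """Absolute standardized mean difference of x between t == 1 and t == 0."""
    w = np.ones(len(x)) if w is None else np.asarray(w, dtype=float)
    m1 = np.average(x[t == 1], weights=w[t == 1])
    m0 = np.average(x[t == 0], weights=w[t == 0])
    # Pooled standard deviation of the unweighted groups (one common convention).
    sd = np.sqrt((x[t == 1].var(ddof=1) + x[t == 0].var(ddof=1)) / 2.0)
    return abs(m1 - m0) / sd

def iptw_weights(t, ps):
    """Inverse probability of treatment weights from propensity scores ps."""
    return np.where(t == 1, 1.0 / ps, 1.0 / (1.0 - ps))

rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 2))
ps = 1.0 / (1.0 + np.exp(-X[:, 0]))   # only the first covariate drives treatment
t = rng.binomial(1, ps)
w = iptw_weights(t, ps)               # true propensity used here for simplicity

before = [asmd(X[:, k], t) for k in range(2)]
after = [asmd(X[:, k], t, w) for k in range(2)]
```

In this toy setting the first covariate is strongly imbalanced before weighting and close to balanced afterwards, while the second covariate is already balanced. In practice the propensity score would be estimated rather than known.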

To determine the most suitable approach for estimating the conditional treatment effect, we compared three popular methods in causal inference, namely the G-formula [30], propensity score matching [33], and TMLE, using simulated virtual experience data.
Figure 2 shows that when confounding between the intervention and outcome is minimal, the inverse probability of treatment weighting (IPTW) method often outperforms the G-formula and targeted maximum likelihood estimation (TMLE). However, as confounding (or selection bias) grows more pronounced, the G-formula and TMLE become more accurate in estimating the treatment effect. IPTW, while effective under certain conditions, shows limitations in both performance and its capacity to predict effects on new data because it relies on matching within the observed dataset. Consequently, our analysis favors TMLE for its double robustness, which gives it an advantage over the G-formula in the accuracy of effect estimation.
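To make the comparison concrete, the following sketch computes a G-formula, an IPTW, and a TMLE estimate of the average effect of a single binary treatment on a bounded outcome. It is a simplified illustration under stated assumptions: plain logistic working models fitted by Newton's method stand in for the CatBoost models used in the study, the toy data have one treatment and one confounder, and the function names, propensity bounds and parameter values are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, offset=None, n_iter=30):
    """(Quasi-)logistic regression by Newton's method; y may be fractional in [0, 1]."""
    offset = np.zeros(len(y)) if offset is None else offset
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta + offset)
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def estimate_effects(W, A, Y, lo=1.0, hi=7.0, ps_bounds=(0.025, 0.975)):
    n = len(Y)
    ones = np.ones(n)
    ys = (Y - lo) / (hi - lo)                       # scale outcome to [0, 1]
    # Propensity model g(W) = P(A = 1 | W), truncated away from 0 and 1.
    Xg = np.column_stack([ones, W])
    g = np.clip(expit(Xg @ fit_logistic(Xg, A)), *ps_bounds)
    # Initial outcome model Q(A, W).
    bq = fit_logistic(np.column_stack([ones, A, W]), ys)

    def Q(a):
        pred = expit(np.column_stack([ones, np.full(n, a), W]) @ bq)
        return np.clip(pred, 1e-6, 1.0 - 1e-6)

    Q1, Q0 = Q(1.0), Q(0.0)
    g_formula = np.mean(Q1 - Q0)
    # IPTW with normalized weights.
    w = A / g + (1 - A) / (1 - g)
    iptw = (np.average(ys[A == 1], weights=w[A == 1])
            - np.average(ys[A == 0], weights=w[A == 0]))
    # TMLE targeting step: one-parameter fluctuation along the clever covariate H.
    H = A / g - (1 - A) / (1 - g)
    QA = np.where(A == 1, Q1, Q0)
    eps = fit_logistic(H[:, None], ys, offset=logit(QA))[0]
    tmle = np.mean(expit(logit(Q1) + eps / g) - expit(logit(Q0) - eps / (1 - g)))
    scale = hi - lo
    return {"g_formula": g_formula * scale, "iptw": iptw * scale, "tmle": tmle * scale}

rng = np.random.default_rng(0)
n = 5000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, expit(W[:, 0]))                 # W[:, 0] confounds treatment
Y = np.clip(3.5 + 1.0 * A + 0.5 * W[:, 0] + rng.normal(0, 0.5, n), 1, 7)
naive = Y[A == 1].mean() - Y[A == 0].mean()
est = estimate_effects(W, A, Y)
```

In this toy setting the unadjusted difference in means overstates the effect of 1, while the three adjusted estimators should land close to it. The sketch does not reproduce the ranking reported in Figure 2, which depends on the simulation design and the fitted models.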
3 Results
3.1 Imbalance in Confounders in Urban Design Studies
Our investigation employed a virtual reality (VR) setup to elucidate the variance in walkability perceptions among distinct demographic groups. This analysis was predicated on the application of five targeted design environment interventions (as detailed in Table 2).
Intervention | Levels |
---|---|
Land use mix (LM) | (1) Residential land-use; (0) Mixed with commercial area |
Block connectivity (BC) | (1) High connectivity; (0) Low connectivity |
Road size (RS) | (1) Two lanes with narrow pedestrian zone; (0) One lane with wide pedestrian zone |
Open space (OS) | (1) Has open space in the block; (0) Does not have open space in the block |
Green/Trees (GT) | (1) Has trees in the block; (0) Does not have trees in the block |
The experimental framework yields 32 ($2^5$) possible combinations of the five binary interventions, each associated with a conditional average treatment effect (CATE; see the Methodology section for a comprehensive explanation). As illustrated in Figure 3, the absolute standardized mean differences (ASMD) after propensity score matching show a notable contraction in the discrepancies between covariates across treatment and control groups. This contraction indicates a moderate degree of confounding within the VR dataset under examination.
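The count of combinations follows directly from five binary interventions. A small sketch, using the abbreviations from Table 2 as labels, enumerates them and groups the non-empty combinations by the number of active interventions; the variable names are illustrative.

```python
from itertools import product

NAMES = ("LM", "BC", "RS", "OS", "GT")

# All 2^5 on/off combinations of the five interventions.
combos = list(product((0, 1), repeat=len(NAMES)))

# Scenarios with at least one active intervention (the all-zero combination
# is the reference condition).
scenarios = [c for c in combos if any(c)]

# Group scenarios by how many interventions are active.
by_count = {}
for c in scenarios:
    by_count.setdefault(sum(c), []).append(c)

counts = {k: len(v) for k, v in sorted(by_count.items())}
```

Excluding the all-zero reference condition leaves 31 scenarios, split as 5, 10, 10, 5 and 1 across one to five concurrent interventions.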

The presence of unbalanced confounders between treatment and control groups introduces potential bias in the estimated CATE from our interventions, which may significantly affect the validity and reliability of causal inferences drawn from the study.
In the context of our study, where diverse demographic groups’ perceptions of walkability are evaluated through design interventions, such imbalances could skew the interpretation of how environmental changes impact pedestrian behavior. Specifically, if certain demographic characteristics or pre-existing conditions are disproportionately represented in either group, the observed effects may not accurately reflect the interventions’ true impact but rather the underlying differences between the groups. This could lead to erroneous conclusions about the effectiveness of design interventions in improving urban walkability. To mitigate this issue, our study utilizes the Targeted Maximum Likelihood Estimation (TMLE) algorithm, an analytical approach designed for precision in the face of measured confounding variables.
3.2 Conditional Average Treatment Effect
The results presented in Table 3 offer a nuanced understanding of how varied urban design interventions influence perceived walkability among different demographic segments.
With a single intervention applied, Land use mix (LM) showed a modest positive effect (CATE: 0.92%, SE: 0.602%), suggesting that even isolated alterations to land use configurations may enhance walkability perceptions. Block connectivity (BC) exhibited a slightly larger beneficial impact (CATE: 1.11%, SE: 0.547%), reinforcing the value of interconnected block designs in urban settings. In contrast, adjustments solely to Road size (RS) led to a small decrement in walkability (CATE: -1.21%, SE: 0.489%), suggesting that, without additional supportive measures, enlarging road dimensions may adversely affect walkability.
Scenarios featuring Open space (OS) as the sole intervention demonstrated a marked positive effect (CATE: 3.83%, SE: 0.716%), underscoring the important role open areas play in promoting walkability. Similarly, the exclusive addition of Green/Trees (GT) was associated with a favorable outcome (CATE: 2.03%, SE: 0.525%), highlighting the contribution of green elements to urban environments.
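The reported estimates and standard errors can be turned into approximate 95% intervals to gauge how firmly each single-intervention effect is separated from zero. The sketch below assumes a normal approximation (estimate plus or minus 1.96 standard errors), which is an assumption of this example and not necessarily how the intervals in the study were constructed.

```python
# Single-intervention CATE and SE values (in %) as reported in the text.
SINGLE = {
    "LM": (0.92, 0.602),
    "BC": (1.11, 0.547),
    "RS": (-1.21, 0.489),
    "OS": (3.83, 0.716),
    "GT": (2.03, 0.525),
}

def wald_interval(estimate, se, z=1.96):
    """Approximate 95% interval under a normal approximation (assumed here)."""
    return estimate - z * se, estimate + z * se

intervals = {name: wald_interval(est, se) for name, (est, se) in SINGLE.items()}
z_ratio = {name: est / se for name, (est, se) in SINGLE.items()}
```

Under this approximation the LM interval spans zero, whereas the intervals for BC, RS, OS and GT do not, so the LM effect is best read as suggestive rather than conclusive.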
When two or more interventions were combined, combinations involving OS without LM yielded the highest CATE, approximately 4%, showing the pronounced impact of open spaces when combined with other urban design elements. With LM included, the CATE diminished to 3.486%. Intriguingly, scenarios incorporating two or more interventions alongside LM experienced a decrease in CATE, pointing to a potential dilution or negative interaction effect when LM is part of a multiple-intervention strategy.
These findings suggest a complex interplay between various urban design interventions and their collective impact on walkability. The substantial positive effects of OS and GT, even as standalone interventions, reinforce the importance of integrating natural and open spaces within urban landscapes to enhance pedestrian experiences.
Conversely, the observed decrease in CATE with certain combinations implies the need for careful consideration in selecting and implementing intervention mixtures to avoid counterproductive outcomes.
#Interventions | Scenario | Land use mix (LM) | Block connectivity (BC) | Road size (RS) | Open space (OS) | Green/Trees (GT) | CATE |
---|---|---|---|---|---|---|---|
1 | 1 | 1 | 0 | 0 | 0 | 0 | |
1 | 2 | 0 | 1 | 0 | 0 | 0 | |
1 | 3 | 0 | 0 | 1 | 0 | 0 | |
1 | 4 | 0 | 0 | 0 | 1 | 0 | |
1 | 5 | 0 | 0 | 0 | 0 | 1 | |
2 | 6 | 1 | 1 | 0 | 0 | 0 | |
2 | 7 | 1 | 0 | 1 | 0 | 0 | |
2 | 8 | 1 | 0 | 0 | 1 | 0 | |
2 | 9 | 1 | 0 | 0 | 0 | 1 | |
2 | 10 | 0 | 1 | 1 | 0 | 0 | |
2 | 11 | 0 | 1 | 0 | 1 | 0 | |
2 | 12 | 0 | 1 | 0 | 0 | 1 | |
2 | 13 | 0 | 0 | 1 | 1 | 0 | |
2 | 14 | 0 | 0 | 1 | 0 | 1 | |
2 | 15 | 0 | 0 | 0 | 1 | 1 | |
3 | 16 | 1 | 1 | 1 | 0 | 0 | |
3 | 17 | 1 | 1 | 0 | 1 | 0 | |
3 | 18 | 1 | 1 | 0 | 0 | 1 | |
3 | 19 | 1 | 0 | 1 | 1 | 0 | |
3 | 20 | 1 | 0 | 1 | 0 | 1 | |
3 | 21 | 1 | 0 | 0 | 1 | 1 | |
3 | 22 | 0 | 1 | 1 | 1 | 0 | |
3 | 23 | 0 | 1 | 1 | 0 | 1 | |
3 | 24 | 0 | 1 | 0 | 1 | 1 | |
3 | 25 | 0 | 0 | 1 | 1 | 1 | |
4 | 26 | 1 | 1 | 1 | 1 | 0 | |
4 | 27 | 1 | 1 | 1 | 0 | 1 | |
4 | 28 | 1 | 1 | 0 | 1 | 1 | |
4 | 29 | 1 | 0 | 1 | 1 | 1 | |
4 | 30 | 0 | 1 | 1 | 1 | 1 | |
5 | 31 | 1 | 1 | 1 | 1 | 1 | |
3.3 Effect Sensitivity

In Figure 4, we present an analysis of the impact of urban design interventions, contingent on the extent of concurrent interventions. The number of interventions—ranging from one, indicating the exclusive presence of the intervention under consideration, to five, denoting the hypothetical application of all interventions—serves as a proxy for the complexity of urban design strategies. The Conditional Average Treatment Effect (CATE) is computed for select interventions, adjusting for potential confounders and the influence of other concurrent interventions based on Table 3. It reveals a spectrum of responsiveness across interventions: certain measures are more heavily influenced by the number of concurrent interventions than others. This variability underscores a complex interplay among urban design measures and highlights the importance of a calibrated approach in policy formulation and the execution of urban development agendas.
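One way to organize such an analysis is to average, for a given intervention, the scenario-level CATEs over all scenarios that include it, grouped by the total number of active interventions. The sketch below shows this grouping only. Whether Figure 4 was produced exactly this way is an assumption, and the `toy_cate` values are placeholders (each scenario is assigned its number of active interventions), not results from the study.

```python
from itertools import product

NAMES = ("LM", "BC", "RS", "OS", "GT")

def mean_cate_by_count(cate, name):
    """Average CATE over scenarios that include `name`, keyed by the number
    of active interventions. `cate` maps 0/1 tuples (ordered as NAMES) to values."""
    j = NAMES.index(name)
    groups = {}
    for combo, value in cate.items():
        if combo[j] == 1:
            groups.setdefault(sum(combo), []).append(value)
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}

def scenarios_per_count(name):
    """How many scenarios include `name` at each intervention count."""
    j = NAMES.index(name)
    out = {}
    for combo in product((0, 1), repeat=len(NAMES)):
        if combo[j] == 1:
            out[sum(combo)] = out.get(sum(combo), 0) + 1
    return dict(sorted(out.items()))

# Placeholder values only: each scenario's "CATE" is its number of active interventions.
toy_cate = {c: float(sum(c)) for c in product((0, 1), repeat=5) if any(c)}
os_profile = mean_cate_by_count(toy_cate, "OS")
os_counts = scenarios_per_count("OS")
```

Each intervention appears in 1, 4, 6, 4 and 1 scenarios at one to five concurrent interventions, so the averages at the two ends of the range rest on a single scenario each.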
First, the analysis reveals distinct patterns of influence as the number of interventions increases. LM shows an incremental positive trend in average CATE values with additional interventions, starting from a negligible effect at a single intervention (CATE = 0.92%) to a substantially larger effect with five interventions (CATE = 3.49%). This pattern suggests that interventions promoting mixed land use could have compounding beneficial impacts.
Notably, OS shows a strong effect (CATE = 3.83%) as a single intervention and remains stable as the number of interventions rises. It consistently exhibits the highest CATE values across all levels of intervention, implying that creating or preserving open spaces could be the most effective singular urban design intervention in our case study.
RS, BC and GT parallel LM, with CATE values increasing as interventions are added, while GT shows a distinct profile: its CATE starts at a higher baseline (CATE = 2.03% for a single intervention) and also rises as the number of interventions grows. The precision of these effects is reflected in the 95% confidence intervals, which are notably tight for OS, suggesting a high level of confidence in its positive impact.
Overall, the evidence advocates for a strategic approach to urban design, where specific combinations and numbers of interventions can be tailored to maximize the beneficial effects of urban features on the targeted outcomes.
4 Discussion
The present study combines a VR-based conjoint experiment with heterogeneous treatment effect estimation to investigate the impact of urban design interventions on walkability. Recent studies have highlighted the value of VR as a tool in urban design and planning research, as it allows for the controlled manipulation of environmental variables and the effective assessment of user responses in a safe and cost-effective manner[34, 35, 36, 37, 38]. Our research extends these findings by utilizing a VR-based conjoint experiment to explore how demographic characteristics influence perceived walkability under various urban design scenarios.
Employing Targeted Maximum Likelihood Estimation (TMLE) for Conditional Average Treatment Effect (CATE) estimation marks a significant methodological advancement in urban design and planning research, particularly in addressing the challenge of imbalanced confounders[39, 40, 41]. This approach, celebrated for its double robustness, significantly mitigates bias and enhances efficiency, enabling nuanced analyses of urban design interventions on walkability across varied demographic groups[42, 25, 43, 44, 45]. Our integration of TMLE within a VR-based experimental framework underscores the potential to refine research validity and reliability in urban studies. The resulting CATE estimations provide insights into treatment effect heterogeneity, guiding the development of targeted urban design strategies that meet the diverse needs of different neighborhoods, thereby facilitating informed decision-making and resource optimization in urban design and planning[46, 43, 47, 48, 49]. This approach not only enhances our understanding of urban walkability design interventions’ varied impacts but also supports the creation of more inclusive and effective urban walkable environments.
Our results present a vivid depiction of the impact of concurrent urban design interventions on walkability, echoing the established narrative that pedestrian experiences are influenced by a complex array of urban elements [50, 51]. The observed enhancements from individual interventions like Land use mix (LM) and Block connectivity (BC) are in line with Ewing and Cervero's "5Ds" framework, which highlights critical aspects such as density and diversity for fostering walkability [50]. Conversely, the study underscores the potential drawbacks of enlarging Road size (RS) on its own, supporting Guo and Loo's critique of road expansions harming pedestrian safety and comfort [44].
The beneficial effects of integrating Open space (OS) and Green/Trees (GT) interventions support the notion, underpinned by the Biophilia hypothesis, that natural elements within urban settings significantly enhance walkability by catering to humans’ inherent affinity for nature[52, 53, 54, 41, 55, 56]. This enhancement is attributed to the role of OS and GT in improving safety perceptions, alleviating stress, and fostering social connections[48, 37, 57], highlighting their strategic value in urban design.
Moreover, this study illuminates the intricate dynamics between various urban design interventions, revealing that certain combinations, especially those including LM, may lead to diminished CATE values, indicating potential counterproductive effects or interactions. This complexity underscores the necessity for a contextually informed and strategic urban design approach, as advocated by Sallis et al. (2016) regarding physical activity and built environments[11]. The continued exploration into the nuanced relationships among urban design features and their collective impact on walkability[48, 37, 57, 47, 38, 41, 55] calls for a refined understanding that can guide the creation of more targeted, effective urban development strategies. By integrating advanced statistical methodologies like TMLE and leveraging findings from recent research, urban planners and policymakers are better positioned to cultivate walkability across varied urban environments.
However, it is essential to acknowledge the potential limitations of our study. While the VR-based experimental setup allows for a controlled manipulation of environmental factors, it may not fully capture the dynamic nature of real-world urban environments and the long-term effects of interventions. Additionally, the TMLE algorithm, although robust, relies on the correct specification of at least one of the treatment and outcome models [42]. If both models are misspecified, the estimates may be biased, which highlights the importance of careful model selection and sensitivity analyses. Future research should focus on validating the findings of our study in real-world settings, employing longitudinal designs and field experiments to assess the long-term impact of urban design interventions on perceived walkability and walking behaviors. Incorporating advanced data collection methods, such as long-term tracking measurements and ecological momentary assessments [58, 59, 57], can provide a more comprehensive understanding of the dynamic interactions between individuals and the built environment.
The integration of TMLE with cutting-edge machine learning algorithms and sophisticated spatial analysis methodologies presents a promising avenue for future research to improve the precision and dependability of results[60, 40]. The field of machine learning has recently made significant strides, revealing intricate patterns and interconnections within urban datasets[56, 37]. These breakthroughs afford scholars a more comprehensive insight into the intricate nexus of urban planning, walkability, and public health outcomes. Employing spatial analytical techniques, such as geographically weighted regression (GWR) and spatial autocorrelation analysis, is critical for recognizing the spatial dependencies and variations that are characteristic of urban data.
In conclusion, our study leverages the potential of VR technology and advanced statistical methods to explore urban design interventions and their effects on walkability. Through controlled virtual experiments and sophisticated data analysis, we illuminate the complex interplay between urban design elements and their impact on different demographic groups. Our findings underscore the importance of integrating natural spaces, enhancing connectivity, and carefully considering the scale of roadways to foster walkable urban environments. In this way, our research contributes valuable insights into the optimization of urban spaces that cater to the diverse needs and preferences of city dwellers. Looking forward, the intersection of technological innovation and methodological rigor will continue to refine our understanding of the role of urban designers and planners in promoting public health and sustainability. As we advance, embracing the complexity of urban environments through comprehensive, data-driven approaches will be paramount in designing cities that are not only walkable but also inclusive, resilient, and conducive to well-being.
Data availability The data that support the findings of this study are available from the corresponding author, Bojing Liao, upon reasonable request.
Declarations No conflict of interest declared by the authors. All authors agreed to the submission of this version of the manuscript and are responsible for its content.
Funding The present study is supported by the Fundamental Research Funds for the Central Universities (with grant number 20720221045), and the Guangdong Basic and Applied Basic Research Foundation (with grant number 2023A1515110663).
Appendix A Conditional average treatment effect estimation for 2 or more interventions
The interaction among interventions can be represented by the following CATEs:
Here, we notice that when there is no interaction between two interventions, we have
Otherwise, we have:
Proof:
∎
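The distinction between the no-interaction case and the interaction case can be illustrated with a generic two-intervention contrast. The sketch below is an illustration under assumptions: it defines the interaction as the joint effect minus the sum of the two individual effects, and the mean potential outcomes in `additive` and `interacting` are hypothetical numbers, not values from the study.

```python
def joint_and_interaction(mu):
    """mu maps (a, b) in {0, 1}^2 to a mean potential outcome.

    Returns the joint effect of applying both interventions and the
    interaction, defined here as joint effect minus the sum of the
    two individual effects (an assumed, generic definition).
    """
    effect_a = mu[(1, 0)] - mu[(0, 0)]
    effect_b = mu[(0, 1)] - mu[(0, 0)]
    joint = mu[(1, 1)] - mu[(0, 0)]
    return joint, joint - (effect_a + effect_b)

# Hypothetical mean outcomes: purely additive effects of 0.5 and 1.0.
additive = {(0, 0): 3.0, (1, 0): 3.5, (0, 1): 4.0, (1, 1): 4.5}
# Same individual effects, but the combination falls short of their sum.
interacting = {(0, 0): 3.0, (1, 0): 3.5, (0, 1): 4.0, (1, 1): 4.25}

joint_add, inter_add = joint_and_interaction(additive)
joint_int, inter_int = joint_and_interaction(interacting)
```

In the additive case the joint effect equals the sum of the individual effects and the interaction is zero. In the second case the interaction is negative, which corresponds to the kind of diminished combined effect discussed for combinations that include LM.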
References
- [1] P.Y. Collins, M. Sinha, T. Concepcion, G. Patton, T. Way, L. McCay, A. Mensa-Kwao, H. Herrman, E. de Leeuw, N. Anand, et al., Making cities mental health friendly for adolescents and young adults. Nature pp. 1–12 (2024)
- [2] B.P. Loo, W.W. Lam, R. Mahendran, K. Katagiri, How is the neighborhood environment related to the health of seniors living in hong kong, singapore, and tokyo? some insights for promoting aging in place. Annals of the American Association of Geographers 107(4), 812–828 (2017)
- [3] B. Giles-Corti, A. Vernez-Moudon, R. Reis, G. Turrell, A.L. Dannenberg, H. Badland, S. Foster, M. Lowe, J.F. Sallis, M. Stevenson, et al., City planning and population health: a global challenge. The lancet 388(10062), 2912–2924 (2016)
- [4] M.J. Nieuwenhuijsen, P. Dadvand, S. Márquez, X. Bartoll, E.P. Barboza, M. Cirach, C. Borrell, W.L. Zijlema, The evaluation of the 3-30-300 green space rule and mental health. Environmental Research 215, 114387 (2022)
- [5] D. Doiron, E.M. Setton, K. Shairsingh, M. Brauer, P. Hystad, N.A. Ross, J.R. Brook, Healthy built environment: Spatial patterns and relationships of multiple exposures and deprivation in toronto, montreal and vancouver. Environment International 143, 106003 (2020)
- [6] S. Hajna, K. Dasgupta, N.A. Ross, Laboratory-assessed markers of cardiometabolic health and associations with gis-based measures of active-living environments. International Journal of Environmental Research and Public Health 15(10), 2079 (2018)
- [7] V. Houlden, J.P. de Albuquerque, S. Weich, S. Jarvis, A spatial analysis of proximate greenspace and mental wellbeing in london. Applied Geography 109, 102036 (2019)
- [8] D.L. Crouse, L. Pinault, A. Balram, M. Brauer, R.T. Burnett, R.V. Martin, A. Van Donkelaar, P.J. Villeneuve, S. Weichenthal, Complex relationships between greenness, air pollution, and mortality in a population-based canadian cohort. Environment international 128, 292–300 (2019)
- [9] S.R. Kellert, E.O. Wilson, The biophilia hypothesis (Island press, 1993)
- [10] G. Felsten, Where to take a study break on the college campus: An attention restoration theory perspective. Journal of environmental psychology 29(1), 160–167 (2009)
- [11] J.F. Sallis, E. Cerin, T.L. Conway, M.A. Adams, L.D. Frank, M. Pratt, D. Salvo, J. Schipperijn, G. Smith, K.L. Cain, et al., Physical activity in relation to urban environments in 14 cities worldwide: a cross-sectional study. The lancet 387(10034), 2207–2217 (2016)
- [12] E. Leslie, T. Sugiyama, D. Ierodiaconou, P. Kremer, Perceived and objectively measured greenness of neighbourhoods: Are they measuring the same thing? Landscape and urban planning 95(1-2), 28–33 (2010)
- [13] O.T. Mytton, N. Townsend, H. Rutter, C. Foster, Green space and physical activity: an observational study using health survey for england data. Health & place 18(5), 1034–1041 (2012)
- [14] B. Liao, P.E. van den Berg, P.J. van Wesemael, T.A. Arentze, Individuals’ perception of walkability: Results of a conjoint experiment using videos of virtual environments. Cities 125, 103650 (2022)
- [15] M.R. Desjardins, E.T. Murray, G. Baranyi, M. Hobbs, S. Curtis, Improving longitudinal research in geospatial health: an agenda. Health & Place 80, 102994 (2023)
- [16] A. Li, E. Martino, A. Mansour, R. Bentley, Environmental noise exposure and mental health: evidence from a population-based longitudinal study. American journal of preventive medicine 63(2), e39–e48 (2022)
- [17] A.T. Kaczynski, K.A. Henderson, Environmental correlates of physical activity: a review of evidence about parks and recreation. Leisure sciences 29(4), 315–354 (2007)
- [18] Y. Zhao, P.E. van den Berg, I.V. Ossokina, T.A. Arentze, Comparing self-navigation and video mode in a choice experiment to measure public space preferences. Computers, Environment and Urban Systems 95, 101828 (2022)
- [19] R. Hurtubia, R. Mora, F. Moreno, The role of bike sharing stations in the perception of public spaces: A stated preferences analysis. Landscape and Urban Planning 214, 104174 (2021)
- [20] A. Birenboim, M. Dijst, D. Ettema, J. de Kruijf, G. de Leeuw, N. Dogterom, The utilization of immersive virtual environments for the investigation of environmental preferences. Landscape and Urban Planning 189, 129–138 (2019)
- [21] D.B. Rubin, Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association 100(469), 322–331 (2005)
- [22] M.J. Van Der Laan, D. Rubin, Targeted maximum likelihood learning. The international journal of biostatistics 2(1) (2006)
- [23] U.W. Hayek, M. Teich, T.M. Klein, A. Grêt-Regamey, Bringing ecosystem services indicators into spatial planning practice: Lessons from collaborative development of a web-based visualization platform. Ecological Indicators 61, 90–99 (2016)
- [24] D. Kasraian, S. Adhikari, D. Kossowsky, M. Luubert, G.B. Hall, J. Hawkins, K. Nurul Habib, M.J. Roorda, Evaluating pedestrian perceptions of street design with a 3d stated preference survey. Environment and Planning B: Urban Analytics and City Science 48(7), 1787–1805 (2021)
- [25] M.S. Schuler, S. Rose, Targeted maximum likelihood estimation for causal inference in observational studies. American journal of epidemiology 185(1), 65–73 (2017)
- [26] J.F. Sallis, Measuring physical activity environments: a brief history. American journal of preventive medicine 36(4), S86–S92 (2009)
- [27] B. Liao, P.E. van den Berg, P.J. van Wesemael, T.A. Arentze, Empirical analysis of walkability using data from the netherlands. Transportation research part D: transport and environment 85, 102390 (2020)
- [28] D.A. Hensher, J.M. Rose, W.H. Greene, Applied choice analysis: a primer (Cambridge university press, 2005)
- [29] L. Wu, J. Sun, J. Wang, Y. Zhang, F. Zhang, in 2020 International Conference on Virtual Reality and Visualization (ICVRV) (IEEE, 2020), pp. 348–349
- [30] M.A. Hernán, J.M. Robins. Causal inference (2010)
- [31] J.D. Rennie, N. Srebro, in Proceedings of the IJCAI multidisciplinary workshop on advances in preference handling, vol. 1 (AAAI Press, Menlo Park, CA, 2005)
- [32] L. Prokhorenkova, G. Gusev, A. Vorobev, A.V. Dorogush, A. Gulin, Catboost: unbiased boosting with categorical features. Advances in neural information processing systems 31 (2018)
- [33] P.C. Austin, E.A. Stuart, The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes. Statistical methods in medical research 26(4), 1654–1670 (2017)
- [34] L. Yin, Q. Cheng, Z. Wang, Z. Shao, 'Big data' for pedestrian volume: Exploring the use of google street view images for pedestrian counts. Applied Geography 63, 337–345 (2015)
- [35] X. Lu, A. Tomkins, S. Hehl-Lange, E. Lange, Finding the difference: Measuring spatial perception of planning phases of high-rise urban developments in virtual reality. Computers, Environment and Urban Systems 90, 101685 (2021)
- [36] H. Jiang, S. Geertman, H. Zhang, S. Zhou, Factors influencing the performance of virtual reality in urban planning: Evidence from a view corridor virtual reality project, beijing. Environment and Planning B: Urban Analytics and City Science 50(3), 814–830 (2023)
- [37] S. Qiao, A.G.O. Yeh, Understanding the effects of environmental perceptions on walking behavior by integrating big data with small data. Landscape and Urban Planning 240, 104879 (2023)
- [38] J. Ferré-Bigorra, M. Casals, M. Gangolells, The adoption of urban digital twins. Cities 131, 103905 (2022)
- [39] B.J. Borah, A. Basu, Highlighting differences between conditional and unconditional quantile regression approaches through an application to assess medication adherence. Health economics 22(9), 1052–1070 (2013)
- [40] H. Kang, A. Zhang, T.T. Cai, D.S. Small, Instrumental variables estimation with some invalid instruments and its application to mendelian randomization. Journal of the American statistical Association 111(513), 132–144 (2016)
- [41] X. Liang, T. Zhang, M. Xie, X. Jia, Analyzing bicycle level of service using virtual reality and deep learning technologies. Transportation research part A: policy and practice 153, 115–129 (2021)
- [42] M.J. Van der Laan, S. Rose, et al., Targeted learning: causal inference for observational and experimental data, vol. 4 (Springer, 2011)
- [43] C. Chen, H. Li, W. Luo, J. Xie, J. Yao, L. Wu, Y. Xia, Predicting the effect of street environment on residents’ mood states in large urban areas using machine learning and street view images. Science of The Total Environment 816, 151605 (2022)
- [44] Z. Guo, B.P. Loo, Pedestrian environment and route choice: evidence from new york city and hong kong. Journal of transport geography 28, 124–136 (2013)
- [45] X. Li, C. Zhang, W. Li, R. Ricard, Q. Meng, W. Zhang, Assessing street-level urban greenery using google street view and a modified green view index. Urban Forestry & Urban Greening 14(3), 675–685 (2015)
- [46] E. Chen, Z. Ye, C. Wang, W. Zhang, Discovering the spatio-temporal impacts of built environment on metro ridership using smart card data. Cities 95, 102359 (2019)
- [47] X. Liang, T. Zhao, F. Biljecki, Revealing spatio-temporal evolution of urban visual environments with street view imagery. Landscape and Urban Planning 237, 104802 (2023)
- [48] B. Beck, M. Winters, T. Nelson, C. Pettit, S.Z. Leao, M. Saberi, J. Thompson, S. Seneviratne, K. Nice, M. Stevenson, Developing urban biking typologies: Quantifying the complex interactions of bicycle ridership, bicycle network and built environment characteristics. Environment and Planning B: Urban Analytics and City Science 50(1), 7–23 (2023)
- [49] Y. Liu, R. Wang, Y. Xiao, B. Huang, H. Chen, Z. Li, Exploring the linkage between greenness exposure and depression among chinese people: Mediating roles of physical activity, stress and social cohesion and moderating role of urbanicity. Health & place 58, 102168 (2019)
- [50] R. Ewing, R. Cervero, Travel and the built environment: A meta-analysis. Journal of the American planning association 76(3), 265–294 (2010)
- [51] H.G. Sung, D.H. Go, C.G. Choi, Evidence of jacobs’s street life in the great seoul city: Identifying the association of physical environment with walking activity on streets. Cities 35, 164–173 (2013)
- [52] E.O. Wilson, Biophilia (Harvard university press, 1986)
- [53] M.J. Koohsari, S. Mavoa, K. Villanueva, T. Sugiyama, H. Badland, A.T. Kaczynski, N. Owen, B. Giles-Corti, Public open space, physical activity, urban design and public health: Concepts, methods and research agenda. Health & place 33, 75–82 (2015)
- [54] Y. Lu, C. Sarkar, Y. Xiao, The effect of street-level greenery on walking behavior: Evidence from hong kong. Social Science & Medicine 208, 41–49 (2018)
- [55] L. Wang, Y. Zhou, F. Wang, L. Ding, P.E. Love, S. Li, The influence of the built environment on people’s mental health: an empirical classification of causal factors. Sustainable Cities and Society 74, 103185 (2021)
- [56] N.G. Polson, V.O. Sokolov, Deep learning for short-term traffic flow prediction. Transportation Research Part C: Emerging Technologies 79, 1–17 (2017)
- [57] G. Fancello, J. Vallée, C. Sueur, F.J. van Lenthe, Y. Kestens, A. Montanari, B. Chaix, Micro urban spaces and mental well-being: Measuring the exposure to urban landscapes along daily mobility paths and their effects on momentary depressive symptomatology among older population. Environment International 178, 108095 (2023)
- [58] B. Chaix, Y. Kestens, C. Perchoux, N. Karusisi, J. Merlo, K. Labadi, An interactive mapping tool to assess individual mobility patterns in neighborhood studies. American journal of preventive medicine 43(4), 440–450 (2012)
- [59] S. Shiffman, A.A. Stone, M.R. Hufford, Ecological momentary assessment. Annu. Rev. Clin. Psychol. 4, 1–32 (2008)
- [60] S. Athey, G. Imbens, Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences 113(27), 7353–7360 (2016)