Assessing Mediational Processes Using Piecewise Linear Growth Curve Models with Individual Measurement Occasions
Abstract
Longitudinal processes often unfold concurrently where the growth of two or more longitudinal outcomes are associated. Additionally, if the study under investigation is long, the growth curves may exhibit nonconstant change with respect to time. Multiple existing studies have developed multivariate growth models with nonlinear functional forms to explore joint development where two longitudinal records are correlated over time. However, the relationship between multiple longitudinal outcomes may also be unidirectional. Accordingly, it is of interest to estimate regression coefficients of such unidirectional paths. One statistical tool for such analyses is longitudinal mediation models. In this study, we develop two models to evaluate mediational processes where the linear-linear piecewise growth model is utilized to capture the change patterns. We define the mediational process as either the baseline covariate or the change in covariate influencing the change in the mediator, which, in turn, affects the change in the outcome. We present the proposed models through simulation studies and real-world data analyses. Our simulation studies demonstrate that the proposed mediational models can provide unbiased and accurate point estimates with target coverage probabilities with a confidence interval. The empirical analyses demonstrate that the proposed model can estimate covariates’ direct and indirect effects on the change in the outcome. We also provide the corresponding code for the proposed models.
Keywords Mediation Processes with Nonlinear Trajectories Unknown Knot Locations Individual Measurement Occasions Simulation Studies
Introduction
In the studies with a longitudinal design, multiple variables of interest are collected together. For example, the primary endpoint and multiple secondary endpoints are often recorded repeatedly in clinical trials. In observational studies, such as The Early Childhood Longitudinal Study, Kindergarten Cohort: (ECLS-K: ) (Tourangeau et al.,, 2013), test scores of multiple disciplinary subjects are collected each school year. Given that a longitudinal process exhibits a nonlinear relationship with respect to time if there are some periods where change is more rapid than in others, earlier studies, such as Blozis, (2004); Blozis et al., (2008); Peralta et al., (2020); Liu and Perera, (2021), have developed multivariate growth models (MGMs) (Grimm et al.,, 2016, Chapter 8) to explore joint nonlinear developmental processes, with a focus on an association between these processes by estimating the covariances of between-construct growth factors.
These MGMs have been shown useful in examining joint nonlinear development; however, the relationship between multiple longitudinal outcomes may also be unidirectional in multiple domains (Liu and Perera,, 2021). In the educational field, for example, reading is a crucial developmental process with wide-ranging implications for later academic performance (Torgesen,, 2002). Accordingly, the investigation of how the early development of reading affects the later development of other disciplinary subjects such as mathematics and science may be more important than the alternative since Akbaşlı et al., (2016) has shown that reading comprehension is one of the most critical factors that contribute to mathematics or science achievement. Similarly, in the biomedical area, we often first observe treatment effects from clinical indices and then from patient-reported outcomes (PROs) such as measures related to illness burden or health-related quality of life. In such cases, it is of greater interest to examine how the change in clinical features affects the change in the PROs. The present study aims to develop longitudinal models to investigate mediational processes to estimate regression coefficients between growth factors.
Simple Mediation Model
In this section, we first introduce a simple mediation model, which is represented in the diagram by Figure 1 (Baron and Kenny,, 1986). As shown in the figure, a simple mediation model contains three variables, a predictor , a mediator , and an outcome , with is proposed as influencing through . Note that in this setup, both and are dependent variables. There are two possible pathways through which the predictor can impact : the path leads from to without passing through (referred to as the direct effect of on ), and the path from to through (referred to as the indirect effect or mediated effect of on ). The direct effect represents how is influenced by directly, while the indirect effect represents how is affected by through a causal sequence where affects , which, in turn, impacts .
=========================
Insert Figure 1 about here
=========================
The primary research question when employing a mediation model is estimating and interpreting the direct and indirect effects. For example, in the simple mediation model, the estimation of these effects is through two linear regression models, one for each dependent variable, as shown below
where and are regression constants, and are errors of and , respectively. In addition, , , and are regression coefficients, among which quantifies the direct effect of on , while the product of the other two coefficients, , quantifies the indirect effect of on . According to Baron and Kenny, (1986), the most common situation in social science research is partial mediation, where the mediator only explains part of the relationship between the predictor and the outcome. Therefore, the total effect of on is the sum of direct and indirect effects (MacKinnon et al.,, 2000).
One challenge in the mediation analyses is how to obtain standard errors of mediated effects, based on which we can conduct a hypothesis test for statistically significant partial mediation and construct a confidence interval (CI) for the indirect effect. Generally, earlier studies have recommended employing the Delta method, for example, Sobel, (1982, 1986); MacKinnon et al., (2002) to approximate the standard error of the product of and , or utilizing bootstrap techniques to calculate more accurate estimates (Hayes,, 2009; Shrout and Bolger,, 2002). Moreover, Cheung and Lau, (2008) has shown that the bias-corrected bootstrap CI performs best in the tests for the mediated effects compared to other methods. In the present study, we do not intend to develop a novel approach to obtaining standard errors of mediated effects or compare different approaches for constructing CIs. Instead, we utilize the Delta method to obtain standard errors of mediated effects as more accurate standard errors are out of the primary research interests of the current study. Note that the standard errors calculated using the Delta method can be automatically obtained from multiple software of the structural equation modeling (SEM) framework such as Mplus 8 (Muthén and Muthén,, 2017) and OpenMx (Boker et al.,, 2020).
Earlier studies, for example, Gollob and Reichardt, (1987) have described multiple fundamental problems of applying mediation models to cross-sectional data. On the one hand, the causal relationships often take time to unfold, implying that the magnitude of the causal effect usually depends on the elapsed time between measurement occasions of the predictor, mediator, and outcome. However, the utilization of cross-sectional data assumes that the causal effects are instantaneous and that the magnitude of a causal effect remains the same over time. On the other hand, using cross-sectional data leaves out measurements at previous times that are potential predictors of the constructed model. Accordingly, longitudinal data are favored for testing the mediation hypotheses. Earlier studies proposed applying the simple mediation model to analyze data where the predictor, mediator, and outcome are from multiple waves. Such applications focus on interindividual differences. However, a critical aspect of a longitudinal study is intraindividual change, which cannot be analyzed explicitly by the simple mediation model (Selig and Preacher,, 2009). Additionally, Cole and Maxwell, (2003); Maxwell and Cole, (2007) have demonstrated that the simple mediation model typically yields substantially biased estimates for longitudinal parameters.
Longitudinal Mediation Model
In this section, we briefly review modeling frameworks to investigate mediation in longitudinal data, which allow for the examination of intraindividual change. Researchers have explored mediational processes in three frameworks: cross-lagged panel models (Cole and Maxwell,, 2003), latent growth curve models (Cheong et al.,, 2003), and latent change score models (MacKinnon,, 2008, Chapter 8). Cole and Maxwell, (2003) and MacKinnon, (2008, Chapter 8) have provided extensive overviews of the application of cross-lagged panel mediation models. At least three waves of all three constructs in a fully-specified model, the predictor, the mediator, and the outcome are needed to test how the predictor at the first wave (i.e., ) affects the mediator at the second wave (i.e., ), and in turn, influences the outcome at the third wave (i.e., ).
The second type of longitudinal mediation model stems from the latent growth curve modeling framework. In the scenario where there are a baseline predictor (i.e., a time-invariant predictor), a longitudinal mediator, and a longitudinal outcome, researchers employ linear growth curve models to depict the intraindividual change of the mediator and the outcome and examine how a baseline predictor affects the status or change of the mediator affects the change in the outcome. For example, Cheong et al., (2003) describes how an intervention program (the predictor) affects the change in the outcome through the change of the mediator using a parallel latent growth curve model. Soest and Hagtvet, (2011) showed that, other than the change in the longitudinal mediator process, its initial status can also be viewed as a mediator in a longitudinal mediation model with a note that the regression of the initial status is comparable to the cross-sectional mediation analysis. In addition, MacKinnon, (2008, Chapter 8) added that the predictor can also be a longitudinal variable and demonstrated how to extend the model to perform mediation analyses among three parallel longitudinal processes.
Earlier studies such as MacKinnon, (2008, Chapter 8) and Selig and Preacher, (2009) also extended the latent change score modeling framework (McArdle and Nesselroade,, 1994), where intraindividual change is captured by latent change scores, to examine longitudinal mediation. This modeling framework allows for the examination of mediated effects among the status of the predictor, mediator, and outcome. Its additional advantage is that it enables researchers to investigate how the change score of the predictor during the first time interval (i.e., ) influences the change score of the mediator during the second time interval (i.e., ), and in turn, affects the change score of the outcome during the third time interval (i.e., ). Selig and Preacher, (2009) have shown that the mediation effects can also be any combination of statuses and change scores of constructs. Earlier studies, for example, MacKinnon, (2008, Chapter 8) and Selig and Preacher, (2009), have provided a detailed discussion in terms of the theory of change for the three variables (i.e., the predictor, mediator, and outcome), the role of time, and the types of indirect effects of these three longitudinal mediation models.
The present study focuses on the second type of longitudinal mediation model, latent growth mediation models. The existing latent growth mediation models developed in Cheong et al., (2003) and MacKinnon, (2008, Chapter 8) assume that change patterns of each repeated variable take the linear functional form. Given that a process may show nonlinear change patterns with respect to time, Cheong et al., (2003) suggested employing a two-stage piecewise (also referred to as a bilinear spline, see Grimm et al., (2016, Chapter 12)) parallel growth model instead of the parallel linear growth model to evaluate nonlinear trajectories. Accordingly, in this present study, we developed two models to investigate mediational processes where the parallel bilinear spline growth model (PBLSGM) (Liu and Perera,, 2021) is employed to examine joint longitudinal processes. The first model can be utilized to evaluate a mediational process with a baseline predictor and subsequently two variables that are repeatedly measured over time, while the second model can be employed to investigate a mediational process with three longitudinal variables.
There are two statistical challenges to utilizing a linear-linear piecewise functional form. First, the change point or knot where the change of the rate-of-change occurs must be determined. Earlier studies have demonstrated that the knot can either be pre-specified by domain theories (Dumenci et al.,, 2019; Flora,, 2008; Riddle et al.,, 2015; Sterba,, 2014) or be estimated as an unknown parameter if the knowledge of the knot is unavailable or the estimation of the knot itself is of research interest. (Cudeck and du Toit,, 2003; Harring et al.,, 2006; Kwok et al.,, 2010; Kohli,, 2011; Kohli et al.,, 2013; Kohli and Harring,, 2013; Preacher and Hancock,, 2015; Liu,, 2019; Liu et al., 2019a, ; Liu et al., 2019b, ; Dominicus et al.,, 2008; McArdle and Wang,, 2008; Wang and McArdle,, 2008; Muniz Terrera et al.,, 2011; Kohli et al.,, 2015; Lock et al.,, 2018). In the two proposed models, we assume that the knot of each longitudinal process is unknown and to be estimated for two considerations. On the one hand, time is important in a longitudinal mediation model, and for each process, the knot is the transition time of the two stages. Accordingly, we want to avoid any unnecessary pre-specification of the knot. On the other hand, the knot can be viewed as a developmental milestone or an event in a longitudinal process. For example, when examining sleep behavior of infants over time, the knot could be the age of crawling since the sleep behavior is expected to be different pre- and post- crawling age (Grimm et al.,, 2016, Chapter 10). Similarly, when analyzing the recovery process from knee replacement surgery, the knot is viewed as the transition time from the recovery of surgical pain to the gradual recovery period (Dumenci et al.,, 2019). So ideally, the knot of the predictor process occurs first, followed by the knot of the mediator process, and then the knot of the outcome process. We want to obtain data-driven evidence for this temporal order from the proposed models.
Second, for a longitudinal process that takes a linear-linear piecewise functional form, we have three statuses, the initial status, the status at the knot (i.e., the change-point at which two linear segments join together), and the status at the end of the study. In addition, we have a slope for each of the two linear pieces, which captures the short-term change rate and the long-term change rate, respectively. Harring et al., (2006) has shown that the degrees of freedom of the linear-linear piecewise functional form is four; therefore, we cannot freely estimate all the statuses and slopes. Most existing studies with an unknown knot usually view the initial status, two slopes, and the knot as four free growth coefficients (Harring et al.,, 2006; Preacher and Hancock,, 2015; Liu et al., 2019a, ; Peralta et al.,, 2020; Liu and Perera,, 2021). In this project, we consider the two slopes, the knot and the measurement at the knot, as the free growth coefficients since the measurement at the knot is more relevant than the initial status to longitudinal mediation analysis given two reasons. First, the initial status of a longitudinal variable only provides information at baseline; therefore, the regression of the initial status of the longitudinal mediator on that of the predictor only estimates instantaneous effects, and so does the regression of the initial status of the longitudinal outcome (Soest and Hagtvet,, 2011). Second, as stated earlier, the knot is usually viewed as a milestone or an event in a longitudinal process; therefore, the status of the knot in one longitudinal variable may influence a long-term change in the other longitudinal process. Suppose we have two types of endpoints to evaluate the treatment effects of psychotherapy: one being a clinical indicator and the other a patient-reported outcome (PRO) related to illness burden and health-related quality of life. It is often expected to observe improvement in clinical indicators first and arrive at a plateau, leading to improved PROs. Therefore, the measurement at the plateau (i.e., the knot) of a clinical indicator is likely to affect the improvement in PROs later on.
Based on the above considerations, we view the knot as unknown for each longitudinal process in the two proposed models and consider the two slopes, the knot, and the status at the knot as the free growth coefficients. We then estimate the two slopes and the status of the knot at the individual level with an assumption that each process-specified knot is fixed across all individuals111Although multiple existing studies, for example, Preacher and Hancock, (2015); Liu et al., 2019a ; Peralta et al., (2020); Liu and Perera, (2021) have demonstrated that the variance of the knot can also be estimated, we assume that each process-specified knot is fixed across all individuals since the knot variance is out of the research interests of the present study. . Additionally, similar to Liu and Perera, (2021), we construct the proposed models in the framework of individual measurement occasions using the ‘definition variables’ approach (Mehta and West,, 2000; Mehta and Neale,, 2005; Sterba,, 2014) to avoid potential inadmissible estimation (Blozis and Cho,, 2008; Coulombe et al.,, 2015) and allow for different time structures across constructs (Liu and Perera,, 2021).
The remainder of this article is organized as follows. In the method section, we start from a bilinear spline growth model to estimate a fixed knot for a univariate longitudinal process. We then extend it to longitudinal mediation models and introduce the model specification of the proposed models. We define the mediational process as either the baseline covariate or the change in covariate influencing the change in the mediator, which, in turn, affects the change in the outcome. Next, we describe the model estimation and model evaluation that is realized by the Monto Carlo simulation. We then present simulation results and evaluate the proposed models in terms of non-convergence rate and the performance measures, which include the relative bias, the empirical standard error (SE), the relative root-mean-squared error (RSME), and the empirical coverage probability for a nominal confidence interval of each parameter. Next, in the application section, we demonstrate how to apply the proposed models to examine a data set of longitudinal records of reading, mathematics, and science abilities from ECLS-K: (Tourangeau et al.,, 2013). Finally, discussions are framed regarding practical considerations, methodological considerations, and future directions.
Method
Bilinear Spline Growth Curve Model with a Fixed Knot
In this section, we briefly describe a latent growth curve (LGC) model with a linear-linear functional form to estimate a fixed knot in the framework of individual measurement occasions. As shown in Figure 2, we specify a separate linear function for each of the two phases of the developmental process and express the measurement for the individual at the time in the framework of individual measurement occasions as
(1) |
where and are the measurement and measurement occasion of the individual at wave . In Equation (1), , and are the individual-level intercept, first slope and second slope, which are usually called ‘growth factors’ in the latent growth curve modeling framework; they three along with the fixed knot together determine the functional form of the growth curve of .
=========================
Insert Figure 2 about here
=========================
Existing SEM software such as Mplus (Muthén and Muthén,, 2017) and the R package OpenMx (Boker et al.,, 2020) does not allow for a conditional statement of an unknown parameter such as in Equation (1), so the piecewise functional form above cannot be specified directly. In order to unify pre- and post-knot expressions, we have to reparameterize growth factors, which can be realized through multiple approaches. Harring et al., (2006) proposed to reparameterize three growth factors in Equation (1) (i.e., , and ) to the average of the two intercepts, the average of the two slopes, and the half difference between the two slopes. Alternatively, Grimm et al., (2016, Chapter 11) suggested reexpressing the three growth factors to the measurement at the knot and two slopes. Additionally, Liu et al., 2019a reparameterized , and as the measurement at the knot, the average of the two slopes, and the half difference between the two slopes.
The reparameterized growth factors proposed in Harring et al., (2006); Liu et al., 2019a are no longer directly related to the underlying developmental process and therefore are less useful in the present study where we want to estimate regression coefficients between original growth factors. Accordingly, we follow the reparameterized approach in Grimm et al., (2016, Chapter 11), where all three reparameterized coefficients are still directly related to the growth patterns. We then write the repeated outcome as
where is the measurement at the knot of the individual.
Model Specification of Mediation Model with Baseline Predictor, Longitudinal Mediator, and Longitudinal Outcome
In this section, we extend the univariate linear-linear latent growth curve model to analyze a mediational process with baseline predictor, longitudinal mediator, and longitudinal outcome. Suppose both the mediator and the outcome take the linear-linear functional form with an unknown fixed knot. In such situations, we can construct a model for a mediational process where the baseline predictor influences the change in the outcome directly and indirectly through its effect on the change in the mediator. The longitudinal mediation model can be specified as
(2) |
where and are a vector of the repeated measurements of the mediator and outcome of the individual, respectively (in which is the number of measurement occasions). With an assumption that the trajectories of mediator and outcome process take the linear-linear functional form with an unknown knot, () is a vector of growth factors,
in which , , and represent the slope of the first stage, the measurement at the knot, and the slope of the second stage, respectively. The corresponding factor loadings , a matrix, is expressed as
where the subscript indicates that we build the model in the framework of individual measurement occasions. Additionally, is a vector of residuals of the individual.
To define a mediation model with a baseline predictor, a longitudinal mediator, and a longitudinal outcome, we regress the growth factors of the mediator process on the predictor for the individual
(3) |
where , which is either continuous or binary, is the baseline covariate of the individual, is a vector of growth factor intercepts of the mediator process (which is the mean vector of growth factors of the mediator if the covariate is centered), and is a vector of regression coefficients from the predictor to the first slope, to the measurement at the knot, and to the second slope of the mediator process, that is
(4) |
Additionally, we need to regress the growth factors of the outcome process on the predictor and the growth factors of the mediator process
(5) |
where is a vector of growth factor intercepts of the outcome process, is a vector of regression coefficients from the predictor to the first slope, to the measurement at the knot, and to the second slope of the outcome process, that is
(6) |
Additionally, is a matrix of regression coefficients from the growth factors of the mediator process to those of the outcome process. The coefficient matrix can be expressed as
(7) |
In Equations (4), (6), (7), the superscript and subscript of together define the path of the corresponding coefficient. For example, is the path coefficient from the baseline covariate to the first slope of the mediator process. Similarly, is the coefficient from the first slope of the mediator process to the knot measurement of the outcome process. Additionally, is a vector of deviations of the individual from the mean values of the variable-specific growth factors. We list six possible paths of indirect effects of the predictor on the outcome process in the specified model in Table 1.
=========================
Insert Table 1 about here
=========================
Model Specification of Mediation Model with Longitudinal Predictor, Mediator and Outcome
In this section, we extend the univariate linear-linear latent growth curve model to investigate a mediational process with a longitudinal predictor, mediator, and outcome. Suppose all three variables take a linear-linear functional form with an unknown fixed knot. In such situations, we can construct a longitudinal mediation model to explore how the change in covariate influences the change in the outcome directly and indirectly through its effect on the change in the mediator. The longitudinal mediation model can be specified as
(8) |
which defines a longitudinal process for the predictor in addition to the mediator and outcome processes specified in Equation (2). In Equation (8), is a vector of the repeated measures of the predictor for the individual. In addition, , , and are the growth factors (a vector), the corresponding factor loadings (a matrix), and the residuals (a vector) of the covariate process, respectively. We then write the growth factors of the covariate process as deviations from the corresponding growth factor means
(9) |
where and are the mean vector of growth factors (a vector) and the deviations (a vector) of the covariate process. The growth factors of the mediator process are regressed on the those of the covariate, as shown below
(10) |
where is a vector of growth factor intercepts of the mediator, is a matrix of regression coefficients from the growth factors of the covariate process to those of the mediator process
(11) |
Similarly, the growth factors of the outcome process are regressed on those of the covariate and mediator
(12) |
where is a vector of growth factor intercepts of the outcome, () is a matrix of regression coefficients from the growth factors of the covariate (mediator) process to those of the outcome process. The coefficient matrix is defined as
(13) |
and has the same expression as such in Equation (7). Ten possible paths of indirect effects of the predictor process on the outcome process in the specified model are also listed in Table 1.
Model Estimation
To simplify estimation, we make the following four assumptions. First, we assume that the covariate’s growth factors are normally distributed. Second, we assume that the mediator’s growth factors are normally distributed conditional on the baseline predictor (or growth factors of the predictor process). The third assumption is that the outcome’s growth factors are normally distributed conditional on the baseline covariate (or growth factors of the covariate process) and the growth factors of the mediator process. Accordingly, (), where is a (unexplained) variance-covariance matrix of growth factors of the variable u. We also assume that the individual residuals are identical and independent normal distributions over time, and the covariances between residuals are homogeneous over time. We define as a identity matrix, and for the first and second proposed models, the corresponding variance-covariance matrix of residuals can be expressed as
and
respectively.
The parameters in the first proposed model include the mean and variance of the covariate, the intercepts (), and unexplained variance-covariance () of the growth factors of the mediator (outcome) process, the coefficients from the covariate to the mediator (outcome) growth factors and those from the mediator growth factors to the outcome growth factors, the mediator (outcome) residual variance, and the residual covariance. We define as
to list the parameters specified in the first proposed longitudinal mediation model, where and are those defined in Equations (4) and (6), respectively.
The parameters in the second proposed model include the mean vector () and variance-covariance matrix () of the growth factors of the predictor process, the intercepts () and unexplained variance-covariance () of the growth factors of the mediator (outcome) process, the coefficients between growth factors of three longitudinal processes, the residual variances and covariances. We define as
and list the parameters specified in the second longitudinal mediation model, where and are those defined in Equations (11) and (13), respectively.
We estimate and using full information maximum likelihood (FIML) to account for the individual measurement occasions and the potential heterogeneity of individual contributions to the likelihood function. In this work, we constructed the proposed longitudinal mediation models using the R package OpenMx with CSOLNP optimizer (Pritikin et al.,, 2015; Neale et al.,, 2016; Boker et al.,, 2020; Hunter,, 2018).
In addition to the parameters that can be estimated directly, in practice, we are also interested in estimating the conditional mean vector and variance-covariance matrix of the growth factors of the mediator (outcome) process and indirect effects as well as the total effects listed in Table 1. When fitting the model using the R package OpenMx, we can specify the additional parameters in the function mxAlgebra() (Boker et al.,, 2020). We need to provide the algebraic expressions of the conditional mean vector and variance-covariance matrix of growth factors of the mediator (outcome) process and those in Table 1; then OpenMx is capable of computing the point estimates along with their standard errors of these additional parameters (that is realized by the Delta Method). We provide the expressions of the conditional mean vector and variance-covariance matrix of growth factors of the mediator (outcome) process for the two models in Appendix Appendix A and Appendix Appendix B, respectively. Additionally, we provide OpenMx code in the online appendix (https://github.com/Veronica0206/Extension_projects) and a demonstration to show how to build the proposed models and estimate these additional parameters.
Model Evaluation
In this section, we perform simulation studies to evaluate the performance of the proposed models. We first discuss the simulation design for the two longitudinal mediation models, along with the performance metrics and the consideration for the number of replications. After that, we describe the general steps for data generation.
Design of Simulation Study
We provide the detailed simulation designs for the two longitudinal mediation models in Table S1 and Table S2. The parameters of the most interest in the proposed model are the coefficients , and , based on which we obtain direct, indirect, and total effects that the baseline predictor (or predictor process) has on the outcome process. The conditions hypothesized to influence the estimation of these coefficients and other model parameters include sample size, the number of repeated measurements, the knot locations, the ratio of indirect effects to direct effects, shapes of trajectories, and measurement precision. Accordingly, we fixed the factors, for example, the mean vectors and variance-covariance matrices of the growth factors of longitudinal processes, that presumably do not affect the model performance meaningfully.
The primary objective of the simulation study is to evaluate how the proportion of mediated effects affects the proposed models. Accordingly, for the first mediation model, we fixed so that the predictor account for variability of the growth factors of the outcome process (i.e., the medium level according to Cohen, (1988, Chapter 9)). Additionally, we considered three levels of so that the predictor explains zero, medium (i.e., ), and substantial (i.e., ) variability of the growth factors of the mediator process. We then manipulated to further adjust the proportion of mediated effects (see the setup of coefficients for each condition in Table S1 in the Online Supplementary Document). For the elements in , we considered the diagonal coefficients and off-diagonal coefficients separately since the diagonal elements are related to the immediate effects222The immediate effects could be, for example, the coefficient from the first slope of the mediator process to the first slope of the outcome process or that from the knot measurement of the mediator process to the knot measurement of the outcome process., while the off-diagonal elements are related to the delayed effects333The delayed effects could be, for example, the coefficients from the first slope of the mediator process to the knot measurement of the outcome process. in a mediational process. For all three levels of , we considered three levels of indirectly immediate effects and two levels of indirectly delayed effects. In addition, for where the predictor explains variability of the growth factors of the mediator process, we considered two additional conditions to examine how varying path coefficients with fixed mediated effects affect model performance. Among all conditions, the maximum mediated effects from to the outcome growth factors are provided in the footnote of Table S1. Note that in the condition with the maximum mediated effects, the sum of the indirect effects achieved of the total effect, which is comparable to other methodological work such as Cheong et al., (2003).
Similarly, in the second longitudinal mediation model, we fixed so that the growth factors of the predictor process account for variability of the corresponding immediate growth factor and variability of each delayed growth factor of the outcome process. We considered two levels of so that the growth factors of the predictor process account for () variability of the corresponding immediate growth factor and () variability of each delayed growth factor of the mediator process (see the setup of coefficients for each condition in Table S1 in the Online Supplementary Document). For each level of , we assessed two levels of non-zero coefficients for the elements in . Among all conditions, the maximum mediated effects from the growth factors of the predictor process to those of the outcome processes are provided in Table S2. Note that in the condition with the maximum mediated effects, the sum of the indirect effects achieved of the total effect, which is comparable to other methodological work such as Cheong et al., (2003).
Another important factor in the simulation design to evaluate longitudinal models is the number of repeated measures since the proposed model is employed to analyze a longitudinal data set. Intuitively, the model should perform better with more repeated measurements. We also realized that the transition time to the second stage of the covariate (mediator) process, ideally, should occur no later than that of the mediator (outcome) process in longitudinal mediation models. We then decided to test two different levels of the number of measurements: six and ten. We selected six as the minimum number of repeated measurements to ensure the proposed model was fully identified444Bollen and Curran, (2005, Chapter 4) has shown that the latent growth model with linear-linear functional form can be identified with at least five waves with a knot at the midway of the study duration, though no studies provided information on identification for the model with an unknown knot.. For the conditions with six repeated measures, we set the transition time to the second stage of the mediator and outcome process at halfway through the study duration (, ). We considered the other condition, ten measurement occasions, for two reasons. On the one hand, we wanted to examine whether an increasing number of repeated measures would improve model performance. More importantly, this condition allowed us to evaluate the model performance under two scenarios: (1) the knot of each longitudinal process is in the middle of the study duration (i.e., , ); (2) the transition time of the mediator (covariate) process occurs earlier than that of the outcome (mediator) process (i.e., and for the first model, while , and for the second model). In addition, to account for individual measurement occasions, we allowed a time window with width around each wave, which is considered a ‘medium’ deviation as in Coulombe et al., (2015). Moreover, we examined three common change patterns, two levels of residual variances ( or ) and set the residual correlation as . We also examined the models at two levels of sample size, and .
We evaluate the proposed longitudinal mediation models through the performance measures, including the relative bias, the empirical standard error (SE), the relative root-mean-square error (RMSE), and the empirical coverage probability for a nominal confidence interval of each parameter of the proposed models. Table 2 lists the definitions and estimates of these four performance measures. Specifically, the relative bias and empirical SE quantify whether the model, on average, targets the population values and whether estimates are precise, respectively. The relative RMSE integrates the bias and the precision metric into one measure. The coverage probability tells how well the interval estimate covers the corresponding population value.
=========================
Insert Table 2 about here
=========================
We decided to replicate the simulation study by an empirical approach following Morris et al., (2019). Specifically, we run a pilot simulation study, the standard error of bias of all parameters except the (unexplained) variance of knot measurement of each longitudinal process (i.e., , ) was less than . Therefore, we need at least replications to keep the Monte Carlo standard error of the bias below 555 (Morris et al.,, 2019).. Out of more conservative consideration, we decided to proceed with .
Data Generation and Simulation Step
We carried out the following general steps for the simulation study of the proposed mediation models:
-
1.
Generate the baseline predictor (or growth factors of the predictor process), growth factors of the mediator process, and those of the outcome process simultaneously with the prespecified mean vectors and variance-covariance matrices (details are provided in Appendix Appendix C) using the R package MASS (Venables and Ripley,, 2002),
-
2.
Generate a scaled and equally-spaced time structure with waves and obtain individual measurement occasions: by allowing a time-window with width around each wave,
-
3.
Calculate factor loadings for each individual of the (predictor or) mediator or outcome process from individual measurement occasions and the process-specific knot location,
-
4.
Calculate the values of the repeated measurements for each longitudinal process from corresponding growth factors, factor loadings, as well as residual variances and covariance,
-
5.
Implement the proposed models on the generated data set, estimate the parameters, and construct corresponding Wald CIs,
-
6.
Repeat the steps as mentioned above until having convergent solutions.
Results
Model Convergence
We first examined the convergence666The convergence was defined as arriving at OpenMx status code that indicates a successful optimization until up to runs with different sets of initial values (Neale et al.,, 2016). rate under each condition for each model before evaluating how the proposed longitudinal mediation models performed. The proposed model converged satisfactorily. For the first model, where we have baseline covariate and longitudinal mediator and outcome, conditions out of conditions reported a convergence rate. Only condition(s) reported up to five non-convergence replications. For the second model, in which we have longitudinal covariate, mediator, and outcome, 166 conditions out of a total 180 conditions reported a convergence rate. Only condition(s) reported up to five non-convergence replications.
Performance Measures
We present the performance metrics, including the relative bias, empirical standard error (SE), relative root-mean-square error (RMSE), and empirical coverage probability (CP) for a nominal confidence interval of each parameter for the proposed models. For each parameter of interest, given the size of the conditions and parameters, we first obtained its four performance measures over replications under each condition in the simulation design. We then summarized these metrics across conditions as the corresponding median and range777For relative biases/RMSEs, we removed the conditions with coefficients with zero population values when summarizing the related summary statistics.. We provide the summary of the four performance metrics for the two models in Table S3-Table S5. These tables show that both models are capable of estimating parameters unbiasedly and precisely and exhibiting target confidence interval coverage in general.
Both models yielded unbiased point estimates along with small empirical standard errors generally. Specifically, for both models, the magnitude of relative biases of the growth factor means was under , suggesting that we are able to have unbiased mean trajectories from both models. Additionally, for the first model, the magnitude of relative biases of the coefficients from the covariate to the mediator (outcome) growth factors was under . Yet some coefficients from mediator growth factors to outcome growth factors, for example, and exhibit some bias (i.e., the relative biases were greater than ). These biased coefficients further led to biased estimates for the mediated effects. For the second model, the magnitude of the coefficients from covariate growth factors to mediator (outcome) growth factors except () was under . Some coefficients from the mediator growth factors to the outcome growth factors exhibit some bias.
For the first model, we provide Figures 3 and 4 to further examine the relative bias pattern for the coefficients and indirect effects with some bias greater than , respectively. From Figure 3, we noticed that the estimates of and were satisfactory in general (i.e., less than relative bias). Most factors we considered in the simulation design, such as the trajectory shapes and knot locations, did not affect the relative bias meaningfully. These biased estimates were generated under the conditions with a smaller sample size (), a shorter study duration (i.e., six measurements), and a larger residual variance (i.e., ). There are two additional findings from this figure. First, the first model tended to overestimate the path coefficient from the first slope of the mediator to the second slope of the outcome, yet to underestimate the path coefficient from the second slope of the mediator to the second slope of the outcome. Second, the magnitude of relative biases was relatively small under the conditions where there was a relatively greater relationship between the mediator and the outcome. From Figure 4, we also noticed that the mediated effect involved was likely to be overestimated, while that involved was likely to be underestimated. In addition, we noticed that varying path coefficients with fixed mediated effects might slightly affect the performance of the first model.
=========================
Insert Figure 3 about here
=========================
=========================
Insert Figure 4 about here
=========================
For the second model, we plot Figures 5 and 6 to further examine the relative bias pattern for the coefficients and indirect effects with some bias greater than , respectively. From Figure 5, we noticed that the estimates of these coefficients were satisfactory in general (i.e., less than ). Some factors we considered in the simulation design, such as the trajectory shapes, the knot locations, and the magnitude of indirect effects, did not affect the relative bias meaningfully. These biased estimates were generated under the conditions with a smaller sample size (), a shorter study duration (i.e., six measurements), and a larger residual variance (i.e., ). Similar to Model 1, Model 2 was likely to overestimate path coefficients from a first slope to a second slope, but to underestimate the coefficients between two second slopes. From Figure 6, we noticed that the mediated effects involved path coefficients from a first slope to a second slope were tended to be overestimated. In contrast, those involved path coefficients between second slopes were tended to be underestimated.
=========================
Insert Figure 5 about here
=========================
=========================
Insert Figure 6 about here
=========================
Moreover, estimates obtained from the proposed models were precise: the magnitude of empirical standard errors of the slope- or knot-related parameters was under , although this value of the parameters related to the knot measurement could achieve . The relatively large empirical SEs of the parameters related to the knot measurement were due to the large scale of their population values (the population value of intercept means was around ).
The proposed models are capable of estimating parameters accurately. For both models, the magnitude of relative RMSEs of growth factor means was under , and for knot was under . The relative RMSE magnitude of the path coefficients, indirect effects, and direct effects was relatively large. In addition, both models performed well regarding empirical coverage since the coverage probabilities of all parameters were around . We noticed that the coverage probabilities of indirect effects could achieve , which is greater than the nominal coverage probability (). As stated earlier, we constructed a Wald confidence interval with the SE obtained by the Delta Method and generated by mxAlgebra() automatically for each indirect effect, which should be very similar to the Sobel SE (Sobel,, 1982, 1986; Baron and Kenny,, 1986). Accordingly, the issue of overestimated SE is in line with an earlier study (Cheung,, 2009) and within our expectations to see.
Application
In this section, we demonstrate how to employ the proposed longitudinal mediation models to analyze longitudinal records from the ECLS-K: . This section includes two examples. We illustrate how to apply the first model to investigate how the baseline attentional focus affects the development of reading ability and then the development of mathematics ability. We then demonstrate how to implement the second model to explore how the development of reading ability affects that of mathematics and science ability. We randomly extracted students from ECLS-K: 2011 with complete records of repeated reading, mathematics, science item response theory (IRT) scaled scores, age at each wave, and baseline attentional focus888There are participants in ECLS-K: 2011 in total. After removing records with missing values (i.e., rows with any of NaN/-9/-8/-7/-1), the number of entries is ..
ECLS-K: 2011 is a national longitudinal study of children registered in about kindergarten programs starting from school year in the United States. In ECLS-K: 2011, participants’ reading and mathematics ability were evaluated in nine waves: fall and spring of kindergarten (), first () and second () grade, respectively, as well as spring of (), () and () grade, respectively. Only about students were evaluated in the fall of and (Lê et al.,, 2011). Students’ science assessment started from the spring of kindergarten; accordingly, it was only examined in eight waves. We used children’s age (in months) instead of their grade-in-school to obtain individual measurement occasions in analyses. We considered baseline attentional focus (ranging between and ) as a continuous variable, and its corresponding mean (SD) was 4.91 (1.28) in the extracted subset. Additionally, in the subsample, and of children were boys and girls. Additionally, the extracted sample was represented by White, Black, Latinx, Asian, and others.
Univariate Development
Ideally, as stated earlier, the knot of the covariate process, indicating a milestone or an event in the process, occurs first, followed by the knot of the mediator process and then that of the outcome process. Therefore, we can specify the paths from the covariate knot to the other two knots and the path from the mediator knot to the outcome knot if we observe this temporal order. Otherwise, these paths between the knots may not be reasonable since we usually do not expect a later event predict an earlier one. Therefore, in this section, we first constructed a univariate latent growth curve model with the linear-linear functional form to estimate the fixed knot for each longitudinal process and then decide on possible paths for the proposed models. The estimated knot of the development of reading, mathematics, and science ability was around , , and months, respectively, suggesting that we can construct longitudinal models with the paths between these knots.
Baseline Attentional Focus and Longitudinal Reading and Mathematics Ability
In this section, we built the first longitudinal mediation model to assess how the baseline attentional focus affects the development of reading ability and then the development of mathematics ability. As shown in Figure 7(a), the estimates obtained from the first model lead to a model trajectory that sufficiently captures the smooth line of observed trajectories for each ability. Table 3 presents the estimates of parameters of interest for the first model. Post-knot development of both abilities slowed down substantially. The transition to the slower rate occurred earlier in reading ability ( months) than in mathematics ability ( months). Baseline attentional focus positively affects the pre-knot development and the knot measurement of reading ability, and the knot measurement of mathematics ability. For example, the knot measurement of reading ability and mathematics ability increased and with one standardized unit increase in baseline attentional focus. Moreover, a child who developed more rapidly in reading ability tended to develop more rapidly in mathematics; a child who had higher scores at the knot of reading development tended to have higher mathematics scores at its knot. Additionally, pre-knot development of reading affects the knot measurement of mathematics development positively (estimate was with the p-value of ).
=========================
Insert Figure 7 about here
=========================
=========================
Insert Table 3 about here
=========================
In terms of mediated effects, the baseline attentional focus is associated with the pre-knot development, knot measurement, and post-knot development of the mathematics process indirectly through the corresponding counterpart of the reading process. In addition, we noticed that the effects on the post-knot development of both abilities of the baseline attentional focus were trivial.
Longitudinal Reading, Mathematics and Science Ability
In this section, we built the second longitudinal mediation model to evaluate how the development of reading ability affects the development of mathematics ability and then the development of science ability. From Figure 7(b), the estimates obtained from the second model also lead to a model-implied trajectory that sufficiently captures the smooth line of observed values for each disciplinary subject. We list the estimates of parameters of interest for the second model in Table 4, where we can see post-knot development of science ability slowed down slightly. The change to the slower rate occurred earliest in reading ability ( months), followed by mathematics ability ( months), and then science ability ( months). In general, a child who developed more rapidly in reading ability tended to develop more rapidly in mathematics (science) at an early stage, and a child who had higher scores at the knot of reading development tended to have higher mathematics (science) scores at the corresponding knot. We also noticed that pre-knot development of mathematics impacted the knot measurement of science IRT scores negatively, which seems counter-intuitive. We will provide possible explanations in the Discussion section.
=========================
Insert Table 4 about here
=========================
In terms of mediated effects, the pre-knot development, knot measurement, and post-knot development of the reading ability positively affected the corresponding values of science ability through the counterpart of mathematics ability. Additionally, pre-knot development of reading ability influenced the knot measurement of mathematics scores and the knot measurement of science scores. The knot measurement of reading ability also impacted the post-knot development of mathematics skills and the post-knot development of science skills. In terms of total effects, all three growth factors of reading development affected the corresponding immediate and delayed growth factors of science development. For example, the total effects of the pre-knot development of reading ability on the pre-knot development, knot measurement, and post-knot development of science ability were , , and , respectively.
Discussion
In this article, we propose two longitudinal mediation models to evaluate how a covariate, either its baseline value or its process, affects a mediator process and thus an outcome process. We employ the bilinear spline functional form to approximate the underlying change patterns of all longitudinal processes in the proposed models. This functional form allows for exploring how the change rate during the early stage or the value at a milestone (i.e., the measurement at a knot) of the covariate (mediator) process affects the change rate during the late stage of the mediator (outcome) process. We conducted extensive simulation studies to evaluate the proposed models in terms of convergence rate and performance metrics, including the relative bias, empirical standard error, relative RMSE, and coverage probability. We also illustrate the proposed models using a real-world data set from longitudinal records of reading, mathematics, and science IRT scores from Grade to Grade . The results demonstrate the models’ valuable capabilities of capturing the underlying change patterns of nonlinear longitudinal processes and estimating direct and indirect effects on the development of the outcome of the covariate.
Practical Considerations
In this section, we provide a set of recommendations for empirical researchers who are interested in employing the proposed models. First, we recommend building a univariate growth curve model before constructing the proposed models, as we did in the Application section. These univariate growth curve models allow us to assess whether the proposed functional form can capture the underlying change patterns of each longitudinal process. More importantly, we obtain the estimated knot of each longitudinal process, which allows for examining whether the temporal order of the covariate knot, mediator knot, and outcome knot is satisfied. If we have the temporal order as in the Application section, it is reasonable to add the paths between these knots. Otherwise, we may want to remove the paths to avoid a logical fallacy. For example, if the outcome knot occurred earlier than the mediator knot, we may not want to have the path from the mediator knot to the outcome knot in longitudinal mediation models.
Second, we constructed full models in the simulation study to examine model performance in estimating each possible parameter within research interests. However, we suggest only using the models with all possible paths for exploratory purposes. We recommend constructing a parsimonious model for a more confirmatory analysis due to two considerations. First, it helps avoid the potential collinearity issue in a confirmatory model, which we encountered in the Application section. In the second example, we found that the direct effects on the knot measurement of science ability of the pre-knot development of reading and mathematics ability were and , respectively. Although pre-knot development of mathematics ability might negatively affect the knot measurement of science ability, a more reasonable explanation is that the pre-knot development of reading and mathematics abilities were highly correlated. Therefore, the estimate of either individual effect was invalid, although the bundle of the pre-knot development of the two abilities can still predict the knot measurement of science ability. Accordingly, the paths in a more confirmatory model, where the individual effect is of interest, need to be carefully determined by specific research questions. More importantly, a parsimonious model with fewer paths may help model identification by decreasing the likelihood of the appearance of a not positive-definite sample covariance matrix, a common error in all models in the SEM framework. One underlying mechanism is related to the proposition regarding the positive definite property of the symmetric block matrix, as demonstrated in Appendix Appendix C).
Third, the models proposed and the analyses performs in the two examples in the Application section were pedagogical. We analyzed how the baseline attentional focus affects mathematics development through reading development by constructing the first model. Although we can maintain the temporal order, we cannot claim causality. The alternative pathway that is attentional focus affects the mathematics ability and then influences the reading ability is also plausible. Similarly, an alternative pathway for the second model could be that the development of mathematics development affects science development via the mediation of reading development. Moreover, in the proposed models, only the relationship between those between-construct growth factors is captured by unidirectional paths; that is, the development during the first phase of one ability is not considered a predictor of the development during the second phase of the same ability. Therefore, Equations A.2, A.4, B.2 and B.4 are helpful to obtain the covariance (correlation) between such within-construct growth factors if it is of research interest. In addition, similar to all mediation models, other confounders, such as learning approach or socioeconomic status, could also explain the relationship.
Last, researchers should be aware of the multiple statistical tests conducted when fitting a longitudinal mediation model and should consider adjusting hypothesis tests to control the Type I error rate or false discovery rate. Furthermore, decisions about the importance of insights should not only be based on p-values but also consider effect sizes, prior evidence, and alternative explanations (Wasserstein et al.,, 2019). For instance, in the first example, the effects size of the baseline attentional focus on the post-knot development of reading ability was . Although it is statistically significant at the level, this finding may not be meaningful in practice due to the trivial effect size. Additionally, one should carefully interpret the effect of the pre-knot slope of the covariate to the post-knot slope of the mediator (outcome), especially under conditions with fewer measurement occasions and less precise measurement where the estimates could exhibit some bias greater than based on our simulation study.
Methodological Considerations and Future Directions
There are several directions to consider for future studies. First, the present study allows for employing the bilinear spline functional form to approximate nonlinear change patterns. This functional form is versatile and valuable for exploring longitudinal processes where different change rates correspond to different stages. More importantly, the estimated knot, which indicates a milestone or an event of a longitudinal process, is especially useful in a mediation model because it helps avoid a logical fallacy and ensures consistency between theory and statistical methods. Although the linear-linear piecewise functional form is sufficient in most cases, such as a developmental process with earlier and later stages in psychological phenomena or a recovery process with short-term and long-term recovery periods in biomedical fields, a functional form with more linear segments may also prove useful in practice. The proposed models can also be extended accordingly.
Second, we constructed the Wald confidence interval (CI) with the SE obtained by the Delta Method for the indirect effects and the total effects in the simulation studies and real-world data analyses. In the simulation studies, we noticed that the coverage probabilities could achieve , greater than the nominal coverage probability (). Another possible future direction is to construct other types of confidence intervals, such as the bootstrap CIs. Additionally, it is worth conducting simulation studies to evaluate these alternative methods for constructing CIs.
Third, we may not want to include all possible paths to construct a parsimonious model and avoid the potential collinearity issue. Although we recommend building the reduced model driven by specific research questions, it is still worth developing hypothesis testing for comparing the full model and reduced models, especially in an exploratory stage where even empirical researchers only have vague assumptions about causal relationships. In addition, the development of longitudinal mediation models that allow for penalization and regularization, either through norm (i.e., least absolute shrinkage and selection operator, LASSO (Santosa and Symes,, 1986)), or norm (i.e., ridge adjustment (Hoerl and Kennard, 1970b, ; Hoerl and Kennard, 1970a, )), or both (i.e., elastic net regularization (Zou and Hastie,, 2005)), to automatically select the most important paths and/or shrink the estimates for highly correlated paths is an alternative solution to the issue of collinearity and overparameterization. For a specific research question, after the most important paths and then the most important mediated effects are identified, another possible future direction is to perform power analysis with these identified paths and more levels of sample size and provide some guidelines for future studies.
Concluding Remarks
To summarize, in this article, we propose two longitudinal mediation models to explore multiple nonlinear longitudinal processes. We can evaluate how the baseline covariate or the covariate process affects the outcome process through the mediator process with the proposed models. As discussed above, the proposed methods can be further extended in practice and further examined in methodology.
Appendix Appendix A Expressions of Additional Parameters in the First Proposed Model
For the first mediational model, we express the expected mean vector and the variance-covariance matrix of the mediator and outcome process for the individual as
and
where and are the conditional mean vector and variance-covariance matrix of the mediator’s growth factors given the covariate, which can be expressed as
(A.1) | |||
(A.2) |
in which and are the mean value and variance of the covariate. Additionally, and are the conditional mean vector and variance-covariance matrix of the growth factors of the outcome progress given the covariate and mediator progress
(A.3) | |||
(A.4) |
Appendix Appendix B Expressions of Additional Parameters in the Second Proposed Model
For the second mediational model, we express the expected mean vector and the variance-covariance matrix of the three longitudinal processes for individual as
and
where and are the mean vector and variance-covariance matrix of the growth factors of the covariate process, and are the conditional mean vector and variance-covariance matrix of the growth factors of the mediator given those of the covariate, which can be expressed as
(B.1) | |||
(B.2) |
Additionally, and are the conditional mean vector and variance-covariance matrix of the growth factors of the outcome progress given those of the covariate and mediator progress, which can be written as follows
(B.3) | |||
(B.4) |
Appendix Appendix C Data Generation Approach and Its Implication to Model Identification
There are two approaches to generate a data set for a regression model. Take the simple linear regression as an example. The first approach is to generate a predictor and then generate a conditional distribution of on with pre-specified coefficients. Alternatively, we can generate a joint distribution of and simultaneously. In the present article, we generated a joint distribution for the baseline predictor (or growth factor of the predictor process), growth factors of the mediator process, and the growth factors of the outcome process. Therefore, we simultaneously generate the mean vector and variance-covariance matrix for all parameters involved in the mediation process for each proposed model. One challenge of the approach is that we need to express the mean vector and variance-covariance matrix. This appendix provides the expressions of the mean vector and variance-covariance matrix for both models.
The parameters involved in the first mediation model include the mean and variance of the baseline predictor, the mean vector and variance-covariance matrix of the growth factors of the mediator process, and the mean vector and variance-covariance matrix of the growth factors of the outcome process. The mean vector of these parameters can be expressed as
where , , and have the same definitions as those in Equations A.1 and A.3. The variance-covariance matrix of these parameters is
(C.1) |
where , , and have the same definitions as those in Equations A.2 and A.4. Therefore, the joint distribution of all parameters involved in the first mediation longitudinal model follows a multivariate normal distribution .
The parameters involved in the second mediation model include the mean vector and variance-covariance matrix of the growth factors of the predictor process, the mean vector and variance-covariance matrix of the growth factors of the mediator process, and the mean vector and variance-covariance matrix of the growth factors of the outcome process. The mean vector of these parameters can be expressed as
where , , and have the same definitions as those in Equations B.1 and B.3. The variance-covariance matrix of these parameters is
(C.2) |
where , , and have the same definitions as those in Equations B.2 and B.4. Therefore, the joint distribution of all parameters involved in the second longitudinal mediation model follows a multivariate normal distribution .
One condition of generating the joint distribution of each model is that the corresponding variance-covariance matrix should be a positive definite matrix. According to the proposition regarding the positive definite property of the symmetric block matrix, conceptually, the variance-covariance matrices defined in Equations C.1 and C.2 may not be positive definite if we assign large values to the path coefficients , , or . This implies that the proposed models may suffer the non-convergence issue or fail to converge to global optima if these coefficients are large.
Baseline PredictorLongitudinal MediatorLongitudinal Outcome | ||
Indirect Effects | ||
Putative Mediator | Indirect Effects | Estimates |
Total Effects | ||
Total Effects | Estimates | |
Longitudinal PredictorLongitudinal MediatorLongitudinal Outcome | ||
Indirect Effects | ||
Putative Mediator | Indirect Effects | Estimates |
Total Effects | ||
Total Effects | Estimates | |
Criteria | Definition | Estimate |
Relative Bias | ||
Empirical SE | ||
Relative RMSE | ||
Coverage Probability |
-
a
: the estimate of from the replication
-
b
: the number of replications and set as in our simulation study
-
c
: the mean of ’s across replications
-
d
: an indicator function
Growth Factor Means | ||||
Para. | Reading Ability | Mathematics Ability | ||
Est. (SE) | P value | Est. (SE) | P value | |
() | () | |||
() | () | |||
() | () | |||
() | () | |||
Growth Factor (Unexplained) Variances | ||||
Para. | Reading Ability | Mathematics Ability | ||
Est. (SE) | P value | Est. (SE) | P value | |
() | () | |||
()) | () | |||
() | () | |||
Direct Effects | ||||
Para. | Covariate to Reading Ability | Covariate to Mathematics Ability | ||
Est. (SE) | P value | Est. (SE) | P value | |
() | () | |||
() | () | |||
() | () | |||
Para. | Growth Factors of Reading Ability to Those of Mathematics Ability | |||
— | — | Est. (SE) | P value | |
— | — | () | ||
— | — | () | ||
— | — | () | ||
— | — | () | ||
— | — | () | ||
— | — | () | ||
Para. | Indirect Effects | |||
— | — | Est. (SE) | P value | |
— | — | () | ||
— | — | () | ||
— | — | () | ||
— | — | () | ||
— | — | () | ||
— | — | () | ||
Para. | Total Effects | |||
— | — | Est. (SE) | P value | |
— | — | () | ||
— | — | () | ||
— | — | () |
Growth Factor Means | ||||||
Para. | Reading Ability | Mathematics Ability | Science Ability | |||
Est. (SE) | P value | Est. (SE) | P value | |||
() | () | () | ||||
() | () | () | ||||
() | () | () | ||||
() | () | () | ||||
Growth Factor (Unexplained) Variances | ||||||
Para. | Reading Ability | Mathematics Ability | ||||
Est. (SE) | P value | Est. (SE) | P value | |||
() | () | () | ||||
() | () | () | ||||
() | () | () | ||||
Direct Effects | ||||||
Para. | Growth Factors of Reading Ability to Those of Mathematics Ability | |||||
Est. (SE) | P value | — | — | — | — | |
() | — | — | — | — | ||
() | — | — | — | — | ||
() | — | — | — | — | ||
() | — | — | — | — | ||
() | — | — | — | — | ||
() | — | — | — | — | ||
Growth Factors of Reading/Mathematics Ability to Those of Science Ability | ||||||
Para. | Reading Ability | Mathematics Ability | ||||
Est. (SE) | P value | Est. (SE) | P value | — | — | |
() | () | — | — | |||
() | () | — | — | |||
() | () | — | — | |||
() | () | — | — | |||
() | () | — | — | |||
() | () | — | — | |||
Indirect Effects | ||||||
— | — | — | — | Est. (SE) | P value | |
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
Total Effects | ||||||
— | — | — | — | Est. (SE) | P value | |
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () | ||
— | — | — | — | () |
References
- Akbaşlı et al., (2016) Akbaşlı, S., Şahin, M., and Yaykiran, Z. (2016). The effect of reading comprehension on the performance in science and mathematics. Journal of Education and Practice, 7:108–121.
- Baron and Kenny, (1986) Baron, R. M. and Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of personality and social psychology, 52(6):1173–1182.
- Blozis, (2004) Blozis, S. A. (2004). Structured latent curve models for the study of change in multivariate repeated measures. Psychological Methods, 9(3):334–353.
- Blozis and Cho, (2008) Blozis, S. A. and Cho, Y. (2008). Coding and centering of time in latent curve models in the presence of interindividual time heterogeneity. Structural Equation Modeling: A Multidisciplinary Journal, 15(3):413–433.
- Blozis et al., (2008) Blozis, S. A., Harring, J. R., and Mels, G. (2008). Using lisrel to fit nonlinear latent curve models. Structural Equation Modeling: A Multidisciplinary Journal, 15(2):346–369.
- Boker et al., (2020) Boker, S. M., Neale, M. C., Maes, H. H., Wilde, M. J., Spiegel, M., Brick, T. R., Estabrook, R., Bates, T. C., Mehta, P., von Oertzen, T., Gore, R. J., Hunter, M. D., Hackett, D. C., Karch, J., Brandmaier, A. M., Pritikin, J. N., Zahery, M., and Kirkpatrick, R. M. (2020). OpenMx 2.17.2 User Guide.
- Bollen and Curran, (2005) Bollen, K. A. and Curran, P. J. (2005). Latent Curve Models: A Structural Equation Perspective. John Wiley & Sons, Inc.
- Cheong et al., (2003) Cheong, J., Mackinnon, D. P., and Khoo, S. T. (2003). Investigation of mediational processes using parallel process latent growth curve modeling. Structural equation modeling : a multidisciplinary journal, 10(2):238–262.
- Cheung and Lau, (2008) Cheung, G. W. and Lau, R. S. (2008). Testing mediation and suppression effects of latent variables: Bootstrapping with structural equation models. Organizational Research Methods, 11(2):296–325.
- Cheung, (2009) Cheung, M. W. L. (2009). Comparison of methods for constructing confidence intervals of standardized indirect effects. Behavior Research Methods, 41:425–438.
- Cohen, (1988) Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd Ed.). Lawrence Erlbaum Associates.
- Cole and Maxwell, (2003) Cole, D. A. and Maxwell, S. E. (2003). Testing mediational models with longitudinal data: questions and tips in the use of structural equation modeling. Journal of abnormal psychology, 112(4):558–577.
- Coulombe et al., (2015) Coulombe, P., Selig, J. P., and Delaney, H. D. (2015). Ignoring individual differences in times of assessment in growth curve modeling. International Journal of Behavioral Development, 40(1):76–86.
- Cudeck and du Toit, (2003) Cudeck, R. and du Toit, S. H. C. (2003). Nonlinear multilevel models for repeated measures data. In Reise, S. P. and Duan, N., editors, Multilevel Modeling : Methodological Advances, Issues, and Applications., Multivariate Applications Book Series, chapter 2, pages 1–24. Psychology Press.
- Dominicus et al., (2008) Dominicus, A., Ripatti, S., Pedersen, N. L., and Palmgren, J. (2008). A random change point model for assessing variability in repeated measures of cognitive function. Statistics in Medicine, 27(27):5786–5798.
- Dumenci et al., (2019) Dumenci, L., Perera, R. A., Keefe, F. J., Ang, D. C., J., S., Jensen, M. P., and Riddle, D. L. (2019). Model-based pain and function outcome trajectory types for patients undergoing knee arthroplasty: a secondary analysis from a randomized clinical trial. Osteoarthritis and cartilage, 27(6):878–884.
- Flora, (2008) Flora, D. B. (2008). Specifying piecewise latent trajectory models for longitudinal data. Structural Equation Modeling: A Multidisciplinary Journal, 15(3):513–533.
- Gollob and Reichardt, (1987) Gollob, H. and Reichardt, C. (1987). Taking account of time lags in causal models. Child Development, 58(1):80–92.
- Grimm et al., (2016) Grimm, K. J., Ram, N., and Estabrook, R. (2016). Growth Modeling: Structural Equation and Multilevel Modeling Approaches. Guilford Press.
- Harring et al., (2006) Harring, J. R., Cudeck, R., and du Toit, S. H. C. (2006). Fitting partially nonlinear random coefficient models as sems. Multivariate Behavioral Research, 41(4):579–596.
- Hayes, (2009) Hayes, A. F. (2009). Beyond baron and kenny: Statistical mediation analysis in the new millennium. Communication Monographs, 76(4):408–420.
- (22) Hoerl, A. E. and Kennard, R. W. (1970a). Ridge regression: Applications to nonorthogonal problems. Technometrics, 12(1):69–82.
- (23) Hoerl, A. E. and Kennard, R. W. (1970b). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67.
- Hunter, (2018) Hunter, M. D. (2018). State space modeling in an open source, modular, structural equation modeling environment. Structural Equation Modeling: A Multidisciplinary Journal, 25(2):307–324.
- Kohli, (2011) Kohli, N. (2011). Estimating unknown knots in piecewise linear-linear latent growth mixture models. PhD thesis, University of Maryland.
- Kohli and Harring, (2013) Kohli, N. and Harring, J. R. (2013). Modeling growth in latent variables using a piecewise function. Multivariate Behavioral Research, 48(3):370–397.
- Kohli et al., (2013) Kohli, N., Harring, J. R., and Hancock, G. R. (2013). Piecewise linear-linear latent growth mixture models with unknown knots. Educational and Psychological Measurement, 73(6):935–955.
- Kohli et al., (2015) Kohli, N., Hughes, J., Wang, C., Zopluoglu, C., and Davison, M. L. (2015). Fitting a linear-linear piecewise growth mixture model with unknown knots: A comparison of two common approaches to inference. Psychological Methods, 20(2):259–275.
- Kwok et al., (2010) Kwok, O., Luo, W., and West, S. G. (2010). Using modification indexes to detect turning points in longitudinal data: A monte carlo study. Structural Equation Modeling: A Multidisciplinary Journal, 17(2):216–240.
- Lê et al., (2011) Lê, T., Norman, G., Tourangeau, K., Brick, J. M., and Mulligan, G. (2011). Early childhood longitudinal study: Kindergarten class of 2010-2011 – sample design issues. In JSM Proceedings 2011, pages 1629–1639. Alexandria, VA: American Statistical Association.
- Liu, (2019) Liu, J. (2019). Estimating Knots in Bilinear Spline Growth Models with Time-invariant Covariates in the Framework of Individual Measurement Occasions. PhD thesis, Virginia Commonwealth University.
- Liu and Perera, (2021) Liu, J. and Perera, R. A. (2021). Estimating knots and their association in parallel bilinear spline growth curve models in the framework of individual measurement occasions. Psychological Methods (Advance online publication).
- (33) Liu, J., Perera, R. A., Kang, L., Kirkpatrick, R. M., and Sabo, R. T. (2019a). Obtaining interpretable parameters from reparameterized longitudinal models: transformation matrices between growth factors in two parameter-spaces.
- (34) Liu, J., Perera, R. A., Kang, L., Sabo, R. T., and Kirkpatrick, R. M. (2019b). Hybridizing two-step growth mixture model and exploratory factor analysis to examine heterogeneity in nonlinear trajectories.
- Lock et al., (2018) Lock, E. F., Kohli, N., and Bose, M. (2018). Detecting multiple random changepoints in bayesian piecewise growth mixture models. Psychometrika, 83(3):733–750.
- MacKinnon, (2008) MacKinnon, D. P. (2008). Introduction to statistical mediation analysis. Taylor & Francis Group/Lawrence Erlbaum Associates., New York, fourth edition.
- MacKinnon et al., (2000) MacKinnon, D. P., Krull, J. L., and Lockwood, C. M. (2000). Equivalence of the mediation, confounding and suppression effect. Prevention Science, 1:173–181.
- MacKinnon et al., (2002) MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., and Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7(1):83–104.
- Maxwell and Cole, (2007) Maxwell, S. E. and Cole, D. A. (2007). Bias in cross-sectional analyses of longitudinal mediation. Psychological methods, 12(1):23–44.
- McArdle and Nesselroade, (1994) McArdle, J. J. and Nesselroade, J. R. (1994). Using multivariate data to structure developmental change. In Cohen, S. H. and Reese, H. W., editors, Life-span developmental psychology: Methodological contributions, page 223–267. Lawrence Erlbaum Associates, Inc.
- McArdle and Wang, (2008) McArdle, J. J. and Wang, L. (2008). Modeling age-based turning points in longitudinal life-span growth curves of cognition. In Cohen, P., editor, Applied data analytic techniques for turning points research, Multivariate applications series., chapter 2, pages 1–24. Routledge Taylor & Francis Group.
- Mehta and Neale, (2005) Mehta, P. D. and Neale, M. C. (2005). People are variables too: Multilevel structural equations modeling. Psychological Methods, 10(3):259–284.
- Mehta and West, (2000) Mehta, P. D. and West, S. G. (2000). Putting the individual back into individual growth curves. Psychological Methods, 5(1):23–43.
- Morris et al., (2019) Morris, T. P., White, I. R., and Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11):2074–2102.
- Muniz Terrera et al., (2011) Muniz Terrera, G., van den Hout, A., and Matthews, F. E. (2011). Random change point models: investigating cognitive decline in the presence of missing data. Journal of Applied Statistics., 38(4):705–716.
- Muthén and Muthén, (2017) Muthén, L. K. and Muthén, B. O. (2017). Mplus: Statistical Analysis with Latent Variables: User’s Guide (Version 8).
- Neale et al., (2016) Neale, M. C., Hunter, M. D., Pritikin, J. N., Zahery, M., Brick, T. R., Kirkpatrick, R. M., Estabrook, R., Bates, T. C., Maes, H. H., and Boker, S. M. (2016). OpenMx 2.0: Extended structural equation and statistical modeling. Psychometrika, 81:535–549.
- Peralta et al., (2020) Peralta, Y., Kohli, N., Lock, E. F., and Davison, M. L. (2020). Bayesian modeling of associations in bivariate piecewise linear mixed-effects models. Psychological Methods (Advance online publication).
- Preacher and Hancock, (2015) Preacher, K. J. and Hancock, G. R. (2015). Meaningful aspects of change as novel random coefficients: A general method for reparameterizing longitudinal models. Psychological Methods, 20(1):84–101.
- Pritikin et al., (2015) Pritikin, J. N., Hunter, M. D., and Boker, S. M. (2015). Modular open-source software for Item Factor Analysis. Educational and Psychological Measurement, 75(3):458–474.
- Riddle et al., (2015) Riddle, D. L., Perera, R. A., Jiranek, W. A., and Dumenci, L. (2015). Using surgical appropriateness criteria to examine outcomes of total knee arthroplasty in a united states sample. Arthritis care & research., 67(3):349–357.
- Santosa and Symes, (1986) Santosa, F. and Symes, W. W. (1986). Linear inversion of band-limited reflection seismograms. SIAM Journal on Scientific and Statistical Computing, 7(4):1307–1330.
- Selig and Preacher, (2009) Selig, J. P. and Preacher, K. J. (2009). Mediation models for longitudinal data in developmental research. Research in Human Development, 6(2-3):144–164.
- Shrout and Bolger, (2002) Shrout, P. E. and Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7(4):422–445.
- Sobel, (1982) Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13:290–312.
- Sobel, (1986) Sobel, M. E. (1986). Some new results on indirect effects and their standard errors in covariance structural models. Sociological Methodology, 16:159–186.
- Soest and Hagtvet, (2011) Soest, T. and Hagtvet, K. A. (2011). Mediation analysis in a latent growth curve modeling framework. Structural equation modeling : a multidisciplinary journal, 18(2):289–314.
- Sterba, (2014) Sterba, S. K. (2014). Fitting nonlinear latent growth curve models with individually varying time points. Structural Equation Modeling: A Multidisciplinary Journal, 21(4):630–647.
- Torgesen, (2002) Torgesen, J. K. (2002). The prevention of reading difficulties. Journal of School Psychology, 40(1):7–26.
- Tourangeau et al., (2013) Tourangeau, K., Nord, C., Lê, T., Sorongon, A. G., Hagedorn, M. C., Daly, P., and Westat (2013). User’s manual for the ECLS-K:2011 Kindergarten data file and electronic codebook.
- Venables and Ripley, (2002) Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Springer, New York, fourth edition.
- Wang and McArdle, (2008) Wang, L. and McArdle, J. J. (2008). A simulation study comparison of bayesian estimation with conventional methods for estimating unknown change points. Structural Equation Modeling: A Multidisciplinary Journal, 15(1):52–74.
- Wasserstein et al., (2019) Wasserstein, R. L., Schirm, A. L., and Lazar, N. A. (2019). Moving to a world beyond ‘p<0.05’. The American Statistician, 73(sup1):1–19.
- Zou and Hastie, (2005) Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B., 67(2):301–320.

















