Smoothing spline analysis of variance models: A new tool for the analysis of accelerometer data
Abstract
Accelerometer data is commonplace in physical activity research, exercise science, and public health studies, where the goal is to understand and compare physical activity differences between groups and/or subject populations, and to identify patterns and trends in physical activity behavior to inform interventions for improving public health. We propose using mixed-effects smoothing spline analysis of variance (SSANOVA) as a new tool for analyzing accelerometer data. By representing data as functions or curves, smoothing splines allow for accurate modeling of the underlying physical activity patterns throughout the day, especially when the accelerometer data is continuous and sampled at high frequency. The SSANOVA framework makes it possible to decompose the estimated function into the portion that is common across groups (i.e., the average activity) and the portion that differs across groups. By decomposing the function of physical activity measurements in this manner, we can estimate group differences and identify the time regions where they occur. In this study, we demonstrate the advantages of using SSANOVA models to analyze accelerometer-based physical activity data collected from community-dwelling older adults across various fall risk categories. Using Bayesian confidence intervals, the SSANOVA results can be used to reliably quantify physical activity differences between fall risk groups and identify the time regions that differ throughout the day.
Keywords: Smoothing spline ANOVA, Functional data analysis, Accelerometer data, Physical activity, Mobile health, Wearable devices.
1 Introduction
Accelerometer data collected from wearable devices for monitoring daily physical activity is commonly used in physical activity research, exercise science, and fall risk intervention studies [1, 2]. Physical activity is an integral component of a healthy lifestyle, particularly for older adults. Extensive research has highlighted the importance of physical activity for this population over a prolonged period of time [3, 4, 5]. It is essential to promote and support the integration of regular physical activity in older adults. Federal physical activity guidelines recommend that older adults engage in a variety of physical activities, including balance, aerobic, and muscle-strengthening exercises [6]. Regular physical activity has been shown to reduce the risk of chronic diseases in older adults, such as heart disease, stroke, diabetes, and certain forms of cancer [7]. Additionally, physical activity can help manage pre-existing conditions like high blood pressure, arthritis, and osteoporosis [8]. The World Health Organization recommends that older adults engage in at least 150 minutes of moderate or 75 minutes of vigorous physical activity per week to maintain their health and well-being [9].
Precisely evaluating physical activity levels among older adults is crucial to identify individuals who require interventions to improve their activity levels and prevent functional decline. Wearable devices that use accelerometers provide a vital tool for accurately tracking physical activity. Unlike self-reported data, accelerometer data offers an objective measurement of physical activity levels that is more accurate and reliable. By recording continuous measurements of physical activity levels over several days or weeks, accelerometers provide a more comprehensive view of an individual’s physical activity patterns, including the frequency, intensity, and duration of physical activity. These devices produce quantitative measures of physical activity such as step count, moderate-to-vigorous physical activity (MVPA), and overall physical activity volume, which can be used to establish goals and monitor progress over time. Standardized methods for measuring physical activity levels allow for comparisons between individuals and populations, making it easier to identify those at risk of physical inactivity and to develop targeted interventions. The small, wearable design of accelerometer devices makes them comfortable and practical for older adults to use throughout the day. However, accelerometer data can be complex and require substantial data processing to be usable, while the quality of the data can be affected by various factors such as device placement, calibration, and wear compliance. Furthermore, statistical analysis of accelerometer data can be challenging, requiring the analysis of multivariate and longitudinal data, as well as handling missing data and non-normal distributions.
Accelerometer data for physical activity assessment are characterized by substantial noise and a large number of time points, which makes them difficult to interpret. While curves averaged across individuals have been used to visualize overall patterns, the high uncertainty associated with these curves limits their practical utility [10]. Compared with self-report, more objective ways of measuring physical activity include direct observation by an independent observer and real-time recording by devices such as pedometers [11, 12], heart-rate monitors [13, 14], armbands [15], and accelerometers [16, 17, 18]. Pedometers are cost-efficient and convenient [19], although they do not measure the duration or intensity of physical activity. Various statistical methods have been explored to analyze accelerometer data, and a recent review article discusses them [20]. The functional data analysis framework has been used for minute-by-minute temporal pattern analysis of physical activity [21, 22, 23]. Nevertheless, statistical approaches that incorporate an ANOVA-type decomposition when comparing accelerometer data across multiple groups are still lacking.
To address these challenges, we propose to use a mixed-effects smoothing spline analysis of variance (SSANOVA) framework [24, 25, 26, 27] to analyze the accelerometer data. With the smoothing spline ANOVA model, raw accelerometer data can be processed through smoothing functions to reduce noise and enhance data quality. Additionally, this model can facilitate the identification of intricate associations between physical activity levels and variables such as age, gender, and health status. Because it is flexible and can handle complex, non-linear relationships, the smoothing spline ANOVA model is a promising statistical tool for obtaining precise and dependable estimates of physical activity levels and their links to health outcomes. The smoothing spline model and its applications have been recently reviewed by Zhang et al. [28]; see also the references cited therein. The SSANOVA model has been applied in diverse fields. For instance, Davidson [29] employed SSANOVA to compare whole tongue contours obtained via ultrasound imaging. Luo et al. [30] used the model to generate smoothing spline estimates for temperature data collected at multiple time points, taking into account the effects of year and location. Helwig et al. [31] applied the SSANOVA framework to estimate spatiotemporal trends and assess uncertainty using massive samples of social media data. SSANOVA has also been used to analyze cyclic biomechanical data [10].
In our study, we collected accelerometer data for the physical activity assessment of 121 individuals over the age of 60 for a period of 7 consecutive days, utilizing wearable activity trackers [32]. We employed ActiGraph accelerometers (ActiGraph GT9X Link wireless, ActiGraph LLC.) to collect the data, offering an impartial evaluation of physical activity that is not dependent on self-reporting. Our aim was to use a mixed-effects smoothing spline analysis of variance (SSANOVA) framework to analyze accelerometer data, gaining a better understanding of older adults’ physical activity levels and promoting the importance of implementing balance-improving interventions and cognitive restructuring to encourage physical activity among older adults.
The structure of this paper is as follows: Section 2 outlines the smoothing spline analysis of variance model. Section 3 presents the descriptive statistics of the accelerometer data and the demographics of the older adults. In that section, we also apply mixed-effects SSANOVA models to estimate the smooth functions from the noisy measurements obtained from the ActiGraph device. Section 4 provides a conclusion and discusses future directions.
2 Smoothing Spline Analysis of Variance (SSANOVA)
We propose a mixed-effects SSANOVA model of the form
$$y_i = \eta(\mathbf{x}_i) + b_{s_i} + \epsilon_i, \tag{1}$$
where $y_i$ is the recorded physical activity at time point $t_i$, $\mathbf{x}_i = (t_i, g_i)$ is the fixed effect predictor vector corresponding to the $i$-th data point with $t_i$ denoting the time point and $g_i$ denoting the group, $s_i$ is the subject indicator, $b_{s_i}$ is the subject random effect such that $b_{s} \sim N(0, \sigma_b^2)$, and $\epsilon_i \sim N(0, \sigma^2)$ is random error that is independent of the random effects, for $i = 1, \ldots, n$, where $n$ is the total number of data points after vectorizing the data.
The rationale behind the SSANOVA is that the smooth function $\eta$ is decomposed into components associated with time and group membership as
$$\eta(t, g) = \eta_0 + \eta_1(t) + \eta_2(g) + \eta_{12}(t, g), \tag{2}$$
where $\eta_0$ is a constant function, $\eta_1$ is the main effect of time $t$, $\eta_2$ is the main effect of the group, and $\eta_{12}$ is the time-group interaction effect. The SSANOVA model makes it possible to examine the physical activity difference between two groups
$$\delta_{jk}(t) = \eta(t, j) - \eta(t, k) = \eta_2(j) - \eta_2(k) + \eta_{12}(t, j) - \eta_{12}(t, k), \tag{3}$$
for different groups $j$ and $k$. The functional differences $\delta_{jk}$ quantify the differences between the physical activity of groups $j$ and $k$ across the entire study period, and can be used to identify time regions where functional differences exist between the groups, instead of the traditional pointwise comparisons for the statistical analysis of accelerometer data.
The function $\eta$ is estimated by minimizing the penalized least squares [24],
$$\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \eta(\mathbf{x}_i) - b_{s_i} \right)^2 + \lambda J(\eta), \tag{4}$$
where the smoothness penalty term $J(\eta)$ is nonnegative definite and $\lambda$ is the smoothing parameter. The model’s optimal smoothing parameters were estimated using the generalized cross-validation method [33].
We can construct the Bayesian confidence intervals (CIs) for the estimated function $\hat{\eta}$ and $\hat{\delta}_{jk}$ via the Bayesian interpretation of a smoothing spline, where the significant functional difference can be determined and uncertainty of the estimation can be quantified [34, 35]. Thus we utilize $\hat{\delta}_{jk}$ and its CIs to test for differences between the groups and identify the time regions of statistically significant differences in physical activities. Time points or regions where the confidence interval of $\delta_{jk}$ does not include zero are considered to demonstrate significant mean differences between the groups. SSANOVA models hence provide an accurate and powerful tool to identify the time regions of statistically significant differences in physical activity between groups.
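To make the decomposition into a constant, a time main effect, a group main effect, and a time-group interaction concrete, the following Python sketch applies it to curves that have already been evaluated on a common time grid. It is only a discrete illustration: the averaging side conditions (each effect averages to zero over its argument), the function names, and the numbers are assumptions of this sketch, not part of the fitted model in the paper.

```python
from statistics import mean


def anova_decompose(eta):
    """Split group curves eta[g][t] into constant, time, group, and interaction parts.

    Assumes averaging side conditions: each effect averages to zero
    over its own argument (an assumption of this sketch).
    """
    groups = list(eta)
    n_time = len(eta[groups[0]])
    eta0 = mean(mean(eta[g]) for g in groups)
    eta_time = [mean(eta[g][t] for g in groups) - eta0 for t in range(n_time)]
    eta_group = {g: mean(eta[g]) - eta0 for g in groups}
    eta_inter = {
        g: [eta[g][t] - eta0 - eta_time[t] - eta_group[g] for t in range(n_time)]
        for g in groups
    }
    return eta0, eta_time, eta_group, eta_inter


def group_difference(eta_group, eta_inter, j, k):
    """Functional difference between groups j and k at each grid time."""
    return [
        (eta_group[j] - eta_group[k]) + (dj - dk)
        for dj, dk in zip(eta_inter[j], eta_inter[k])
    ]


# Hypothetical curves on a 4-point time grid (illustrative numbers only).
curves = {
    "rational": [1.0, 3.0, 5.0, 3.0],
    "congruent": [1.0, 2.0, 3.0, 2.0],
}
eta0, eta_time, eta_group, eta_inter = anova_decompose(curves)
diff = group_difference(eta_group, eta_inter, "rational", "congruent")
```

The constant and the time main effect cancel in the difference, so only the group main effect and the interaction contribute, which is why the group comparison isolates the portion of the function that differs across groups.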
3 Application of Smoothing Spline ANOVA to Accelerometer Data
3.1 Accelerometer Data
We summarize the demographic information, fall-related variables, and the physical activity measurement, vector magnitude (VM), for the 121 participants. The participants were divided into four groups based on their physical abilities and fall risk appraisal: rational, irrational, congruent, and incongruent [32]. The rational group consists of individuals with normal balance and a normal perceived risk of falls. The irrational group has normal balance ability but a higher perceived risk of falls. Both the congruent and incongruent groups have poor balance, but the congruent group exhibits a high perceived risk of falls, while the incongruent group presents a low perceived risk of falls.
Table 1 reports summary statistics of age, gender, history of falls, history of injurious falls, VM, and time by fall risk group, where the VM records were aggregated to one-minute resolution. Figure 1 depicts the daily physical activity, represented as vector magnitude (VM), for the congruent group participants. The sample average over time (Figure 1, grey) shows considerable fluctuation across individuals, while the SSANOVA curve (Figure 1, blue) more clearly visualizes the overall daily physical activity pattern.
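Table 1: Summary statistics by fall risk group, with VM aggregated to one-minute resolution.

| Variable | Congruent: Mean | Congruent: Std | Congruent: Min | Congruent: Median | Congruent: Max | Irrational: Mean | Irrational: Std | Irrational: Min | Irrational: Median | Irrational: Max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VM | 952.0 | 1789.4 | 0.0 | 0.0 | 32439.0 | 971.9 | 1850.5 | 0.0 | 0.0 | 32223.0 |
| Time | 1177.9 | 691.8 | 0.0 | 1157.0 | 2359.0 | 1180.9 | 691.6 | 0.0 | 1156.0 | 2359.0 |
| Age | 78.3 | 7.1 | 67.0 | 79.0 | 93.0 | 74.0 | 7.4 | 60.0 | 75.0 | 90.0 |
| Female | | | | | | | | | | |
| History of Falls | | | | | | | | | | |
| History of Injurious Falls | | | | | | | | | | |

| Variable | Incongruent: Mean | Incongruent: Std | Incongruent: Min | Incongruent: Median | Incongruent: Max | Rational: Mean | Rational: Std | Rational: Min | Rational: Median | Rational: Max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VM | 887.9 | 1797.2 | 0.0 | 0.0 | 37425.0 | 907.6 | 1956.8 | 0.0 | 0.0 | 31737.0 |
| Time | 1177.1 | 691.8 | 0.0 | 1200.0 | 2359.0 | 1178.5 | 692.1 | 0.0 | 1157.0 | 2359.0 |
| Age | 77.6 | 6.3 | 68.0 | 76.0 | 94.0 | 72.2 | 6.3 | 60.0 | 71.0 | 96.0 |
| Female | | | | | | | | | | |
| History of Falls | | | | | | | | | | |
| History of Injurious Falls | | | | | | | | | | |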

3.2 SSANOVA by Fall Risk Groups for Daily Physical Activity
Our objective was to estimate the functional trends of physical activity for these four groups and investigate the variability of the estimated trends [36, 37]. Using the mixed-effects smoothing spline ANOVA model proposed in Eq. 1, we fit the log-transformed VM on time $t$ and fall risk group $g$. The SSANOVA models are fit using the gss package [38] in the R environment [39]. A cubic smoothing spline was used for time of day $t$ (in minutes), and fall risk group $g$ was treated as a nominal variable (rational, irrational, congruent, and incongruent). We estimate the variances of the random subject effects and the random errors (i.e., $\sigma_b^2$ and $\sigma^2$) from the data using a restricted maximum likelihood approach [24, 40]. The optimal values of the smoothing parameters in the models were chosen using the generalized cross-validation method.
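As a rough illustration of penalized least squares with a smoothing parameter chosen by generalized cross-validation, the Python sketch below uses a discrete second-difference penalty on an equally spaced grid (a Whittaker-type smoother). This is a simplified stand-in, not the cubic smoothing spline or the gss implementation used in the paper, and it omits the group terms and random effects. The log(1 + VM) transform is also an assumption, chosen here because VM contains zeros; the text only says the VM was log-transformed.

```python
import math


def second_diff_penalty(n):
    """Return the n x n matrix D'D for the second-difference operator D."""
    penalty = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        coef = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, ci in coef.items():
            for j, cj in coef.items():
                penalty[i][j] += ci * cj
    return penalty


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def smooth(y, lam):
    """Penalized least squares fit and its GCV score for smoothing parameter lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    n = len(y)
    penalty = second_diff_penalty(n)
    system = [[(i == j) + lam * penalty[i][j] for j in range(n)] for i in range(n)]
    hat = invert(system)  # hat matrix mapping observations to fitted values
    fit = [sum(h * v for h, v in zip(row, y)) for row in hat]
    rss = sum((a - b) ** 2 for a, b in zip(y, fit))
    trace = sum(hat[i][i] for i in range(n))
    gcv = n * rss / (n - trace) ** 2  # generalized cross-validation score
    return fit, gcv


def select_lambda(y, grid):
    """Pick the smoothing parameter on the grid with the smallest GCV score."""
    return min(grid, key=lambda lam: smooth(y, lam)[1])


# Hypothetical VM counts at 12 equally spaced times (illustrative only).
vm = [0, 0, 5, 40, 300, 900, 1500, 1200, 800, 400, 60, 0]
y = [math.log1p(v) for v in vm]  # log(1 + VM) keeps zero counts finite
lam_grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = select_lambda(y, lam_grid)
fitted, gcv_score = smooth(y, best_lam)
```

Larger values of the smoothing parameter pull the fit toward a straight line, which the second-difference penalty leaves unpenalized, while values near zero reproduce the data. The GCV score trades residual error against the effective degrees of freedom given by the trace of the hat matrix.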
Figure 2 plots the estimated functions of daily VM along with CIs by fall risk group. The corresponding model has total and estimated error variance . Compared to the sample average, the SSANOVA model takes into account both the within and between subject variability in the data, leading to significant reductions in standard errors and thus narrower CIs. The Bayesian confidence interval indicates that the rational group had the highest physical activity (PA) level while the congruent group had significantly lower PA levels than the other three groups. Additionally, there was a noticeable periodic trend in PA levels observed in all study samples regardless of their fall risk groups, with inactivity during the night (12-5am), an increasing trend in the morning (5am-noon), a peak at noon, a moderate decrease in the afternoon (noon-6pm), and a significant decline in the evening.

3.3 SSANOVA for the History of Falls and Injurious Falls








If we take the history of falls and the history of injurious falls into account as additive functions, the SSANOVA model in Eq. 2 gains one additive term for each of these two variables. The groups with and without a history of falls and the groups with and without a history of injurious falls are then compared via these two terms, respectively, using dummy variables as inputs. Figure 3 provides the estimated SSANOVA curves for the difference in physical activity intensity across 24 hours between the groups with and without a history of falls. The CIs for this difference are also estimated (grey bands in Figure 3). The results indicate that participants with a history of falls have significantly higher physical activity intensity during the daytime (6 am to 10 pm) than those without a history of falls, regardless of fall risk group, since the CIs in Figure 3 lie below the zero reference line from 6 am to 10 pm.
Similarly, Figure 4 provides the estimated SSANOVA curves for the difference in physical activity intensity, with CIs (blue bands), across 24 hours between the groups with and without a history of injurious falls. For the rational group, participants with a history of injurious falls have significantly higher physical activity intensity during the early morning (3 - 6 am) than those with a history of falls but no injurious falls. For the congruent, incongruent, and irrational groups, a similar pattern is observed, with a much narrower significant time region.
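Reading significant time regions off an interval band, as done above for the difference curves, can be sketched in Python as follows. The sketch assumes the estimated difference and its standard error are available on a time grid and that the band is the estimate plus or minus a multiplier times the standard error. The 1.96 multiplier (a nominal 95% level), the function name, and the numbers are assumptions for illustration; the paper does not state the interval level.

```python
def significant_regions(times, estimate, std_error, z=1.96):
    """Return (start, end, sign) for maximal runs where the band excludes zero.

    The band is estimate +/- z * std_error at each grid time; sign is +1 when
    the whole band is above zero and -1 when it is below zero.
    """
    regions = []
    current = None  # [start_time, end_time, sign] of the run being built
    for t, est, se in zip(times, estimate, std_error):
        lower, upper = est - z * se, est + z * se
        sign = 1 if lower > 0 else -1 if upper < 0 else 0
        if current is not None and sign == current[2]:
            current[1] = t  # extend the current run
            continue
        if current is not None:
            regions.append(tuple(current))
        current = [t, t, sign] if sign != 0 else None
    if current is not None:
        regions.append(tuple(current))
    return regions


# Hypothetical difference curve on an hourly grid (illustrative numbers only).
hours = [0, 6, 12, 18, 22, 23]
difference = [0.0, -0.5, -0.8, -0.6, -0.1, 0.05]
std_err = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
regions = significant_regions(hours, difference, std_err)
```

In this toy example the band lies entirely below zero from hour 6 through hour 18, so a single region with negative sign is reported; grid times where the band covers zero separate one region from the next.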
4 Conclusion
This paper applies SSANOVA models to accelerometer data to estimate physical activity patterns in community-dwelling older adults and differences among fall risk groups. The analysis reveals distinct patterns of activity levels throughout the day and across days, with a noticeable periodic trend in physical activity, including inactivity during the night (12-5am), an increasing trend in the morning (5am-noon), a peak at noon, a moderate decrease in the afternoon (noon-6pm), and a significant decline in the evening. The study also finds group-specific physical activity patterns within a day. In particular, the congruent group has significantly lower physical activity levels than the other three groups, while the rational group has the highest physical activity levels, based on Bayesian confidence intervals. These results offer important insights into the daily and longitudinal activity patterns of different groups that can be used to tailor interventions based on fall risk group, balance performance, fear of falling level, and time preference. Moreover, the results can inform interventions to enhance physical activity levels and improve overall health outcomes for older adults.
Additionally, this study investigates daily physical activity levels and patterns by history of falls and history of injurious falls. The outcomes reveal noticeable variations in physical activity levels between groups with and without a history of falls. Moreover, individuals with a history of injurious falls exhibit higher levels of physical activity than those without such injuries, especially in the early morning (12 - 6 am).
As SSANOVA models provide a powerful framework for analyzing cyclic or periodic data, future research can explore their potential in analyzing accelerometer data to identify more complex patterns of physical activity. Overall, the use of SSANOVA models in accelerometer data analysis holds great potential for advancing our understanding of physical activity patterns and their association with health outcomes in older adults, which can inform the development of targeted interventions to improve their overall health and well-being.
One potential future research direction for SSANOVA models in accelerometer data analysis is mobile health and smart intervention using wearable devices. This could involve using SSANOVA models to better understand patterns of physical activity and sedentary behavior in real time, as well as how these patterns relate to health outcomes such as obesity, diabetes, and cardiovascular disease. Additionally, SSANOVA models could be used to develop personalized intervention strategies based on an individual’s unique activity patterns, with the goal of promoting healthy behavior change and preventing disease. Another potential direction could be to incorporate additional sensors, such as heart rate monitors or GPS trackers, to better understand the context in which physical activity occurs and how it relates to health outcomes. Overall, the use of SSANOVA models in mobile health and intervention research has the potential to provide valuable insights into how technology can be leveraged to improve health outcomes and promote healthy behavior change.
References
- [1] L. Thiamwong, R. Xie, J.-H. Park, N. Lighthall, V. Loerzel, and J. Stout, “A technology-based body-mind intervention for low-income American older adults,” Innovation in Aging, vol. 6, no. Supplement_1, pp. 273–273, 2022.
- [2] R. Choudhury, J.-H. Park, L. Thiamwong, R. Xie, J. R. Stout, et al., “Objectively measured physical activity levels and associated factors in older US women during the COVID-19 pandemic: Cross-sectional study,” JMIR Aging, vol. 5, no. 3, p. e38172, 2022.
- [3] R. A. Seguin, J. N. Epping, D. Buchner, R. Bloch, and M. E. Nelson, Growing stronger: strength training for older adults. US Department of Health, 2002.
- [4] M. E. Nelson, W. J. Rejeski, S. N. Blair, P. W. Duncan, J. O. Judge, A. C. King, C. A. Macera, and C. Castaneda-Sceppa, “Physical activity and public health in older adults: recommendation from the American College of Sports Medicine and the American Heart Association,” Circulation, vol. 116, no. 9, p. 1094, 2007.
- [5] W. J. Evans, “Exercise training guidelines for the elderly.,” Medicine and science in sports and exercise, vol. 31, no. 1, pp. 12–17, 1999.
- [6] K. L. Piercy, R. P. Troiano, R. M. Ballard, S. A. Carlson, J. E. Fulton, D. A. Galuska, S. M. George, and R. D. Olson, “The physical activity guidelines for Americans,” JAMA, vol. 320, no. 19, pp. 2020–2028, 2018.
- [7] M. V. Chakravarthy, M. J. Joyner, and F. W. Booth, “An obligation for primary care physicians to prescribe physical activity to sedentary patients to reduce the risk of chronic health conditions,” in Mayo Clinic Proceedings, vol. 2, pp. 165–173, 2002.
- [8] M. Izquierdo, R. Merchant, J. Morley, S. Anker, I. Aprahamian, H. Arai, M. Aubertin-Leheudre, R. Bernabei, E. Cadore, M. Cesari, et al., “International exercise recommendations in older adults (ICFSR): expert consensus guidelines,” The journal of nutrition, health & aging, vol. 25, no. 7, pp. 824–853, 2021.
- [9] World Health Organization, Global recommendations on physical activity for health. World Health Organization, 2010.
- [10] N. E. Helwig, K. A. Shorter, P. Ma, and E. T. Hsiao-Wecksler, “Smoothing spline analysis of variance models: A new tool for the analysis of cyclic biomechanical data,” Journal of Biomechanics, vol. 49, no. 14, pp. 3216–3222, 2016.
- [11] S. Crouter, “Validity of 10 electronic pedometers for measuring steps, distance, and energy cost,” Med Sci Sports Exerc., vol. 36, pp. 331–335, 2004.
- [12] M. Karabulut, S. E. Crouter, and D. R. Bassett, “Comparison of two waist-mounted and two ankle-mounted electronic pedometers,” European journal of applied physiology, vol. 95, no. 4, pp. 335–343, 2005.
- [13] S. E. Crouter, C. Albright, D. R. Bassett, et al., “Accuracy of Polar S410 heart rate monitor to estimate energy cost of exercise,” Medicine and science in sports and exercise, vol. 36, pp. 1433–1439, 2004.
- [14] S. Brage, N. Brage, P. W. Franks, U. Ekelund, and N. J. Wareham, “Reliability and validity of the combined heart rate and movement sensor Actiheart,” European journal of clinical nutrition, vol. 59, no. 4, pp. 561–570, 2005.
- [15] D. Andre, R. Pelletier, J. Farringdon, S. Safier, W. Talbott, R. Stone, N. Vyas, J. Trimble, D. Wolf, S. Vishnubhatla, et al., “The development of the SenseWear® armband, a revolutionary energy assessment device to assess physical activity and lifestyle,” BodyMedia Inc, 2006.
- [16] P. M. Grant, C. G. Ryan, W. W. Tigbe, and M. H. Granat, “The validation of a novel activity monitor in the measurement of posture and motion during everyday activities,” British journal of sports medicine, vol. 40, no. 12, pp. 992–997, 2006.
- [17] J. Vanhelst, J. Mikulovic, G. Bui-Xuan, O. Dieu, T. Blondeau, P. Fardy, and L. Béghin, “Comparison of two ActiGraph accelerometer generations in the assessment of physical activity in free living conditions,” BMC research notes, vol. 5, no. 1, pp. 1–4, 2012.
- [18] A. G. Bonomi, G. Plasqui, A. H. Goris, and K. R. Westerterp, “Estimation of free-living energy expenditure using a novel activity monitor designed to minimize obtrusiveness,” Obesity, vol. 18, no. 9, pp. 1845–1851, 2010.
- [19] K. B. Adamo, S. A. Prince, A. C. Tricco, S. Connor-Gorber, and M. Tremblay, “A comparison of indirect versus direct measures for assessing physical activity in the pediatric population: a systematic review,” International Journal of Pediatric Obesity, vol. 4, no. 1, pp. 2–27, 2009.
- [20] Y. Zhang, H. Li, S. K. Keadle, C. E. Matthews, and R. J. Carroll, “A review of statistical analyses on physical activity data collected from accelerometers,” Statistics in biosciences, vol. 11, pp. 465–476, 2019.
- [21] J. A. Schrack, V. Zipunnikov, J. Goldsmith, J. Bai, E. M. Simonsick, C. Crainiceanu, and L. Ferrucci, “Assessing the “physical cliff”: detailed quantification of age-related differences in daily patterns of physical activity,” Journals of Gerontology Series A: Biomedical Sciences and Medical Sciences, vol. 69, no. 8, pp. 973–979, 2014.
- [22] J. Goldsmith, V. Zipunnikov, and J. Schrack, “Generalized multilevel function-on-scalar regression and principal component analysis,” Biometrics, vol. 71, no. 2, pp. 344–353, 2015.
- [23] H. Li, Y. Zhang, R. J. Carroll, S. K. Keadle, J. N. Sampson, and C. E. Matthews, “A joint modeling and estimation method for multivariate longitudinal data with mixed types of responses to analyze physical activity data generated by accelerometers,” Statistics in medicine, vol. 36, no. 25, pp. 4028–4040, 2017.
- [24] C. Gu, Smoothing spline ANOVA models, vol. 297. Springer, 2013.
- [25] G. Wahba, Spline models for observational data. SIAM, 1990.
- [26] Y. Wang, Smoothing splines: methods and applications. CRC press, 2011.
- [27] C. Gu and P. Ma, “Optimal smoothing in nonparametric mixed-effect models,” Annals of statistics, vol. 33, no. 3, pp. 1357–1379, 2005.
- [28] J. Zhang, H. Jin, Y. Wang, X. Sun, P. Ma, and W. Zhong, “Smoothing spline ANOVA models and their applications in complex and massive datasets,” Topics in Splines and Applications, vol. 63, no. 8, 2018.
- [29] L. Davidson, “Comparing tongue shapes from ultrasound imaging using smoothing spline analysis of variance,” The Journal of the Acoustical Society of America, vol. 120, no. 1, pp. 407–415, 2006.
- [30] Z. Luo, G. Wahba, and D. R. Johnson, “Spatial–temporal analysis of temperature using smoothing spline ANOVA,” Journal of Climate, vol. 11, no. 1, pp. 18–28, 1998.
- [31] N. E. Helwig, Y. Gao, S. Wang, and P. Ma, “Analyzing spatiotemporal trends in social media data via smoothing spline analysis of variance,” Spatial Statistics, vol. 14, pp. 491–504, 2015.
- [32] L. Thiamwong, J. R. Stout, J.-H. Park, and X. Yan, “Technology-based fall risk assessments for older adults in low-income settings: protocol for a cross-sectional study,” JMIR research protocols, vol. 10, no. 4, p. e27381, 2021.
- [33] P. Craven and G. Wahba, “Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation,” Numerische mathematik, vol. 31, no. 4, pp. 377–403, 1978.
- [34] G. Wahba, “Bayesian “confidence intervals” for the cross-validated smoothing spline,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 45, no. 1, pp. 133–150, 1983.
- [35] C. Gu and G. Wahba, “Smoothing spline ANOVA with component-wise Bayesian “confidence intervals”,” Journal of Computational and Graphical Statistics, vol. 2, no. 1, pp. 97–117, 1993.
- [36] L. Thiamwong, M. L. Sole, B. P. Ng, G. F. Welch, H. J. Huang, and J. R. Stout, “Assessing fall risk appraisal through combined physiological and perceived fall risk measures using innovative technology,” Journal of gerontological nursing, vol. 46, no. 4, pp. 41–47, 2020.
- [37] R. Choudhury, J.-H. Park, C. Banarjee, L. Thiamwong, R. Xie, and J. R. Stout, “Associations of mutually exclusive categories of physical activity and sedentary behavior with body composition and fall risk in older women: a cross-sectional study,” International journal of environmental research and public health, vol. 20, no. 4, p. 3595, 2023.
- [38] C. Gu, “Smoothing spline ANOVA models: R package gss,” Journal of Statistical Software, vol. 58, pp. 1–25, 2014.
- [39] R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2022.
- [40] N. E. Helwig, “Efficient estimation of variance components in nonparametric mixed-effects models with large samples,” Statistics and Computing, vol. 26, pp. 1319–1336, 2016.