Health Guardian: Using Multi-modal Data to Understand Individual Health
Abstract
Artificial intelligence (AI) has shown great promise in revolutionizing the field of digital health by improving disease diagnosis, treatment, and prevention. This paper describes the Health Guardian platform, a non-commercial, scientific research-based platform developed by the IBM Digital Health team to rapidly translate AI research into cloud-based microservices. The platform can collect health-related data from various digital devices, including wearables and mobile applications. Its flexible architecture supports microservices that accept diverse data types such as text, audio, and video, expanding the range of digital health assessments and enabling holistic health evaluations by capturing voice, facial, and motion bio-signals. These microservices can be deployed to a clinical cohort specified through the Clinical Task Manager (CTM). The CTM then collects multi-modal clinical data that can be used to iteratively improve the accuracy of AI predictive models, discover new disease mechanisms, or identify novel biomarkers. This paper highlights three microservices with different input data types: a text-based microservice for depression assessment, a video-based microservice for sit-to-stand mobility assessment, and a wearable-based microservice for functional mobility assessment. The CTM is also discussed as a tool to help design and set up clinical studies that unlock the full potential of the platform. Today, the Health Guardian platform is being leveraged in collaboration with research partners to optimize the development of AI models by utilizing a multitude of input sources. This approach streamlines research efforts, enhances efficiency, and facilitates the development and validation of digital health applications.
Index Terms:
Digital Health, Health Guardian Platform, AI/ML Model Development, Microservices, Accelerated Discovery

I Introduction
Digital Health is a growing interdisciplinary field that has seen a rise in popularity in recent years. The growth of personalized, predictive, and preventative healthcare has been fueled by the widespread use of mobile phones, Internet-of-Things (IoT) devices, and wearable sensors, as well as the affordability of cloud computing services. By incorporating new information technologies like edge computing, cloud computing, and artificial intelligence (AI) into healthcare, traditional practices can be improved, and innovative approaches can be created. For example, real-time monitoring of digital biomarkers using wearable and ambient technologies offers a comprehensive view of a person’s health and provides the building blocks for creating a healthcare “digital twin”. The data from these digital biosignals can be collected and aggregated, then fed into AI and machine learning models. These models can surface important health insights and opportunities for early intervention, which can then be shared with the patient and their healthcare team (Figure 1). AI research has already made significant progress in various healthcare domains, such as speech and language analysis for the diagnosis and monitoring of neurodegenerative diseases [1, 2], image analysis for automated detection of diabetic retinopathy [3], and assessments of gait [4], mobility [5], and drawing for evaluating cognitive decline in the elderly [6].
AI models that are trained with data from a multitude of input sources often yield better prediction results and performance. For example, evaluating cognitive decline by combining analysis of drawing and speech can provide a comprehensive understanding of both the motor and linguistic aspects of cognition [7]. In diabetes management, sensors that monitor sleep quality, activity levels, and appetite can improve prediction of daily insulin needs [8]. However, fragmented remote patient monitoring systems make it difficult to gather data from sensors with different input data types (e.g., audio, video, text). Integrating and coordinating data from these sensors, and collecting patient data to train, validate, and often retrain the models, requires significant coordination and time. To address these challenges, we introduce the Health Guardian (HG) platform, a comprehensive end-to-end solution that enables multi-modal assessments of an individual’s health and ensures secure data quality control at every stage of the data life cycle.
II Health Guardian Platform
The Health Guardian (HG) platform provides a flexible framework for the rapid translation of AI research into microservices that can be used to collect and manage health-related data from clinical cohorts. Analytics developed with AI and machine learning (ML) can be converted into deployable microservices using standard HG worker and API-gateway templates, as discussed in [9]. The HG platform also allows users to create customized end-to-end data pipelines to exercise the analytics, where data obtained from various microservices can be fed back into the AI predictive models for iterative improvements. A detailed description of the platform’s architecture and design is provided in [9].
Table 1: A subset of the microservices available on the Health Guardian platform, organized by theme and disease area.

| Theme | Disease Area | Microservice | Description | Input File (Format) | Output | Ref |
| --- | --- | --- | --- | --- | --- | --- |
| Neurodegenerative Disease | Parkinson’s Disease | Timed Up and Go (TUG) | Predict TUG score using data from daily walking activities that are passively captured by a smartwatch | Text (.json) | TUG Score | [10] |
| Neurodegenerative Disease | Parkinson’s Disease | Sit-to-Stand | Analyze sit-to-stand motion from a scripted or unscripted activity using an imager (Red-Green-Blue (RGB), depth, or millimeter-wave camera) to extract metrics of mobility or motor symptoms and predict scores of standard mobility tests | Video (.mp4) | Torso Phase, No. of Hesitations | [11] |
| Neurodegenerative Disease | Parkinson’s Disease | Bradykinesia | Simple on-demand test to infer bradykinesia score from a wrist-worn gyroscope | Text (.json) | Bradykinesia Score, Pronation-Supination Score | [12] |
| Neurodegenerative Disease | Parkinson’s Disease | Postural Instability and Gait Disorder (PIGD) | Infer PIGD score using a lumbar gyroscope/accelerometer from turns during a 1-min walk test | Text (.csv) | PIGD Score | [12] |
| Neurodegenerative Disease | Amyotrophic Lateral Sclerosis (ALS) | PsychE Acoustics | Voice analysis to measure and predict progression of ALS | Audio (.wav) | ALS Score, Voice Report | [1] |
| Neurodegenerative Disease | Alzheimer’s Disease | PsychE Alzheimer’s | Voice analysis to measure and predict the likelihood of developing Alzheimer’s disease | Audio (.wav) | Likelihood Score of Developing Alzheimer’s | [13] |
| Neurodegenerative Disease | Alzheimer’s Disease | Drawing | Analyze freehand drawing with a digitizing tablet and pen to detect Alzheimer’s disease and its prodromal stage (MCI) | Image, Text (.zip) | Probability of cognitive impairments; estimated cognitive and clinical measures, including neuropathological changes | [6, 14] |
| Mental Health | Depression | PHQ-8 Depression Questionnaire | Provide the standard PHQ-8 questionnaire through a mobile phone | Text (.json) | PHQ-8 Score | [15] |
| Mental Health | Suicidality | Suicidality Detection | Process the content of speech or input text to detect signs of depression and suicidality | Text (.json), Audio (.wav) | Suicide Probability | [16] |
| Wellness and Prevention | Mobility | Effective Mobility | A mobility measure that accounts for different types of activity, from walking to moving arms and hands | Text (.json) | Effective Mobility Score | [17] |
| Wellness and Prevention | Sleep | Sleep Quality | Provide the standard Stanford Sleep Questionnaire through a mobile phone | Text (.json) | Sleep Questionnaire Score | [18] |
| Wellness and Prevention | Driving | Driving Risk Assessment | Extract speech features related to future risk of driving accidents by analyzing conversational speech with an AI chatbot | Text (.json), Audio (.wav) | Speech feature scores related to future driving accident risks | [19] |

The data pipeline consists of five primary stages: data source, data ingestion, data preparation, data access, and data analytics (Figure 2). First, data are collected from various data sources such as mobile and IoT devices, wearables, or electronic health records (EHRs). Then, the data are ingested into the clinical task manager (CTM) and routed into appropriate datastores. The data are then prepared using various strategies and steps, including de-identifying protected health information/personally identifiable information (PHI/PII), imputing missing data [20], synchronizing timestamps of data from different sensors, and aggregating data from multiple input streams. The pre-processed data are then accessed by the HG microservice workers to perform downstream analytics via data application programming interfaces (APIs). The HG platform currently supports over 70 capabilities in various focus areas, including neurodegenerative disease, mental health, and wellness and prevention. Finally, the results from the analytics can be viewed through various patient, clinician, or researcher interfaces, such as mobile applications, dashboards, and reports.
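To make the five stages concrete, the sketch below traces a record through a drastically simplified version of this pipeline. It is a minimal illustration in Python: the record fields, the de-identification rule, and all function names are hypothetical stand-ins for the platform components described in [9], not the actual HG APIs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Record:
    """One unit of ingested data (stages 1-2: data source and ingestion)."""
    subject_id: str   # pseudonymous subject identifier
    source: str       # e.g. "smartwatch", "mobile_app", "ehr"
    payload: dict     # raw sensor reading or questionnaire answers

def prepare(record: Record) -> Record:
    """Stage 3: data preparation, here reduced to dropping PHI/PII keys."""
    clean = {k: v for k, v in record.payload.items()
             if k not in {"name", "email", "address"}}
    return Record(record.subject_id, record.source, clean)

def run_pipeline(records: list, analytic: Callable) -> list:
    """Stages 4-5: prepared records are accessed by an analytic worker,
    whose results feed patient, clinician, or researcher interfaces."""
    return [analytic(prepare(r)) for r in records]

# Example: count accelerometer samples in an uploaded record.
demo = [Record("subj-001", "smartwatch", {"name": "Jane", "accel": [0.1, 0.2]})]
print(run_pipeline(demo, lambda r: {"n_samples": len(r.payload["accel"])}))
```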
The HG platform offers a versatile and reusable framework that can support AI/ML-based analytics with various input data types, including audio, text, video, and data from wearable or IoT devices, and can process data using either CPUs or GPUs. Table 1 showcases a subset of the microservices available on the HG platform, categorized by theme and disease area. These microservices can be employed to manage and maintain individual health. For instance, in the aging population, where frailty, fall risks, and functional and cognitive decline are common, the HG platform can facilitate geriatric assessments and interventions both in-clinic and at-home. By selecting and deploying mobility and cognitive assessment microservices, patients can be evaluated comprehensively in different settings. Furthermore, the HG platform enables the study of Parkinson’s disease patients over time through the deployment of microservices such as Timed Up and Go (TUG), sit-to-stand, bradykinesia, and Postural Instability and Gait Disorder (PIGD). Additional microservices utilizing audio or text-based inputs, such as the PHQ-8 depression questionnaire or suicidality prediction, can be added to provide a more comprehensive understanding of a subject’s mobility and mental health status.
The following sections showcase three distinct microservices deployed on the HG platform: a text-based microservice for depression assessment, a video-based microservice for sit-to-stand mobility assessment, and a wearable-based microservice for functional mobility assessment. For each microservice, we provide the clinical background, describe the digital health solution, and briefly outline the analytics used. Each of these microservices can be deployed as a stand-alone assessment or combined with other microservices to obtain comprehensive insights into an individual’s health. In the discussion section, we delve into the application of the CTM in creating clinical cohorts and facilitating the design and execution of clinical studies involving one or more deployed microservices.
III Integration of Text-based Microservice for Depression Assessment
III-A Clinical Background
Depression is a prevalent mental health condition that affects many individuals worldwide. In 2020, an estimated 21.0 million adults (8.4%) and 4.1 million adolescents aged 12 to 17 (17.0%) in the United States experienced at least one major depressive episode [21]. The condition can affect a person’s feelings, thoughts, and ability to perform daily activities, leading to persistent sadness, reduced interest in hobbies, and hopelessness lasting more than two weeks. Depression can also cause physical symptoms such as changes in appetite and sleep patterns, fatigue, and difficulty concentrating, affecting the individual’s capacity to work, sleep, study, and eat.
Validated self-report measures such as the eight-item Patient Health Questionnaire (PHQ-8) [15], the Beck Depression Inventory (BDI) [22], and the Depression Anxiety Stress Scales-21 (DASS-21) [23] are commonly used for screening and assessing the severity of depression, guiding recommended treatments, and monitoring symptoms and recovery. However, traditional methods of administering these assessments are limited by the need for in-person, pencil-and-paper administration at a clinic, which can reduce the frequency at which individuals can be screened and monitored.
To address this issue, the IBM Digital Health team has developed a digital, text-based microservice using the PHQ-8 depression questionnaire as a framework. The PHQ-8 consists of eight items that align with the diagnostic criteria for major depressive disorder in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) [15], which is widely used for depression diagnosis and assessment. This digital microservice can increase access and the frequency at which individuals with mental health disorders can be screened and monitored.

III-B Digital Health Solution
The PHQ-8 questionnaire is delivered to users via the Health Guardian mobile application (Figure 3a.i). Users can select the frequency of each depression symptom they have experienced in the past two weeks, and upon submission, the dataset is parsed into a text-based response in .json format (Figure 3a.ii), and uploaded to the clinical task manager (Figure 3b). The analytic worker processes the structured data, extracts the selected responses from each question (Figure 3c.i), and calculates the total score (Figure 3c.ii). The results are sent back to the mobile application via an API-gateway (Figure 3a.iii), where users can view and track their responses over time. The results are stored as structured data in a datastore, which can be connected to a clinician dashboard or other data rendering interfaces. These tools would enable clinicians or psychiatrists to observe changes in individuals’ symptoms and treatment progression and make informed decisions regarding intervention strategies.
III-C Analytics and Validation
The PHQ-8 questionnaire consists of eight items that evaluate symptoms of depression, with each item scored from 0 points (indicating “not at all”) to 3 points (indicating “nearly every day”). The total score can range from 0 to 24, with scores categorized as follows: no significant depressive symptoms (0 to 4), mild (5 to 9), moderate (10 to 14), moderately severe (15 to 19), and severe (20 to 24) [15].
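The scoring logic the analytic worker performs thus reduces to a sum and a table lookup. The sketch below assumes a hypothetical JSON layout in which the eight item responses arrive as integers under a "responses" key; the microservice's actual payload schema may differ.

```python
import json

# Severity bands for the PHQ-8 total score, per Kroenke et al. [15].
SEVERITY_BANDS = [
    (0, 4, "no significant depressive symptoms"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 24, "severe"),
]

def score_phq8(payload: str) -> dict:
    """Sum the eight item responses (each 0-3) and label the total."""
    answers = json.loads(payload)["responses"]  # hypothetical field name
    assert len(answers) == 8 and all(0 <= a <= 3 for a in answers)
    total = sum(answers)
    label = next(name for lo, hi, name in SEVERITY_BANDS if lo <= total <= hi)
    return {"phq8_total": total, "severity": label}

# A response of "more than half the days" (2) on every item scores 16.
print(score_phq8(json.dumps({"responses": [2] * 8})))
# -> {'phq8_total': 16, 'severity': 'moderately severe'}
```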
Currently, the microservice provides a score for each completed PHQ-8 questionnaire (Figure 3a.iii) and stores the data longitudinally over time. Leveraging AI/ML models, it is possible to analyze the responses collected from multiple PHQ-8 questionnaires to derive further insights. Considering longitudinal data can yield a deeper understanding of an individual’s depressive symptoms and their progression, enabling personalized and effective interventions.
Clinically, a variant of the PHQ-8, the PHQ-9, is also used. The PHQ-8 differs from the PHQ-9 only in that it excludes the final PHQ-9 question, which asks about thoughts of death and self-harm [15]. The PHQ-8 has shown comparable ability to the PHQ-9 in diagnosing and assessing depression across various populations [24, 25, 26]. Because the question about suicidal ideation raises concerns when real-time psychiatric intervention or further suicide-risk evaluation is unavailable [25], the PHQ-8 may be the better alternative for depression screening in studies that do not provide such evaluation.
IV Integration of Video-based Microservice for Sit-to-Stand Mobility Assessment
IV-A Clinical Background
Advancements in computer vision and deep learning have enabled the development of video-based techniques for contact-free, passive evaluation of Parkinson’s Disease (PD) symptoms during daily activities. PD is the second most common progressive neurodegenerative disease, affecting 2-3% of the population over 65 years of age [27]. Its hallmark characteristics are motor symptoms, such as bradykinesia (BRADY), tremors, rigidity, and postural instability and gait disorders (PIGD), along with non-motor symptoms such as olfactory dysfunction and sleep disorders. Although no cure exists for PD, dopamine replacement therapy can mitigate symptoms and improve patients’ quality of life.
Managing the various presentations of PD requires that neurologists understand the severity of symptoms and the extent of motor fluctuations, which may call for changes in medication timing and dosage. In-person assessments at a clinic, done once or twice a year, follow protocols specified in the Unified Parkinson’s Disease Rating Scale (UPDRS) and provide neurologists with information on the severity of symptoms, motor fluctuations, and medication efficacy.
To gain insights into changes in motor symptoms between clinical assessments, clinicians have asked PD patients to provide daily self-assessments or to use wearable sensors to track mobility symptoms. However, recall bias limits the usefulness of self-reports, and adherence issues can affect the deployment of wearable sensors. To address these challenges, IBM’s Digital Health team has developed a video-based method that takes a short 20-30 second video containing sit-stand movements as input to predict UPDRS subscores, such as BRADY and PIGD [11].
IV-B Digital Health Solution
The sit-to-stand microservice is provided through the Health Guardian mobile application. Using the application, the user is prompted to take a short video (~20-30 s) of themselves cycling between a sitting and a standing position several times (Figures 4a.i and 4a.ii). The user uploads the video in .mp4 format from the mobile application to the clinical task manager (CTM), where the raw data and metadata of the file are stored in a database and cloud object storage (Figure 4b). The video file is processed by the microservice’s analytic worker, which calculates the UPDRS scores and generates a torso motion graph depicting the sit-stand movements and associated hesitations. The UPDRS scores and torso motion graph are sent back to the mobile application via an API-gateway (Figure 4a.iii).
IV-C Analytics and Validation
The analytics for the sit-to-stand microservice involve several steps and are described in detail in [11]. Briefly, a short input video sequence (Figure 4c.i) is processed frame-by-frame by a human detector to localize the subject (Figure 4c.ii). The resulting data are then passed to a 2D pose estimation model that predicts the coordinate locations of human joints in 2D image space (Figure 4c.iii). Next, a 3D pose model utilizes the 2D pose information to predict joint locations in 3D Cartesian space (Figure 4c.iv). Finally, the 3D pose information is fed into three different ensemble combinations that incorporate a Hierarchical Convolutional Network (HCN) [28], a Spatio-Temporal Graph Convolutional Network (ST-GCN) [29], and/or Convolutional Neural Networks (CNNs) such as ResNet50 [30] (Figure 4c.v). The UPDRS score is predicted by the model, and graphs such as real-time torso motion can be generated from the processed data (Figure 4c.vi).
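The control flow of this chain can be summarized in a short sketch. Below, only the frame decoding uses a real API (OpenCV); the detector, pose estimators, and ensemble members are dummy stand-ins for the models cited above, so the sketch illustrates the data flow rather than the actual analytics.

```python
import cv2          # pip install opencv-python
import numpy as np

def read_frames(video_path: str) -> list:
    """Steps (i)-(ii): decode the ~20-30 s clip into individual frames."""
    cap, frames = cv2.VideoCapture(video_path), []
    ok, frame = cap.read()
    while ok:
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()
    return frames

# Dummy stand-ins for the human detector, the 2D/3D pose models, and the
# HCN / ST-GCN / CNN ensemble members [28-30].
def detect_person(frame):
    return (0, 0, frame.shape[1], frame.shape[0])   # full-frame box

def estimate_pose_2d(frame, box):
    return np.zeros((17, 2))                        # 17 joints in 2D

def lift_to_3d(pose2d_seq):
    return np.zeros((len(pose2d_seq), 17, 3))       # joints in 3D

ensemble = [lambda seq: 1.0, lambda seq: 1.0, lambda seq: 2.0]

def predict_updrs(video_path: str) -> float:
    frames = read_frames(video_path)                     # step (ii)
    boxes = [detect_person(f) for f in frames]           # human detector
    pose2d = [estimate_pose_2d(f, b) for f, b in zip(frames, boxes)]
    pose3d = lift_to_3d(pose2d)                          # steps (iii)-(iv)
    scores = [model(pose3d) for model in ensemble]       # step (v)
    return float(np.mean(scores))                        # UPDRS subscore
```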
In two separate clinic visits, video clips of sit-stand motions from 35 subjects were captured and used to validate the analytics of the sit-to-stand microservice. This evaluation was part of a larger UPDRS assessment supervised by a neurologist, who was also assigned to score each task. The study demonstrated that it is possible to predict BRADY and PIGD scores from a short sit-stand video clip, with the AI models achieving higher F1-scores (a measure of a model’s accuracy) than the two clinician video raters [11].
V Integration of Wearable-based Microservice for Functional Mobility Assessment
V-A Clinical Background
The Timed Up and Go (TUG) test is a clinical assessment tool used to evaluate balance and gait in everyday tasks such as sitting, standing, walking, and turning. It is commonly used to examine functional mobility in older adults (aged 65+) who may be frail and have a history of falls [31]. The test involves standing up from a chair, walking 3 meters (10 feet), turning around, walking back to the chair, and sitting down again. The time taken to complete the test, measured in seconds, is strongly correlated with the level of functional mobility.
Research has shown that older adults who took 13.5 seconds or longer to perform the TUG were at higher risk of falling, with a positive prediction rate of 90% [32]. Additionally, studies have identified cutoff scores of 11.5 seconds for Parkinson’s disease patients [33] and 14 seconds for stroke patients [34] as indicating increased fall risk. A simple screening rule built on these cutoffs is sketched below.
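The sketch treats each cutoff as inclusive (a time at or above the cutoff flags elevated risk); this is an assumption for the Parkinson's and stroke thresholds, while [32] states the "13.5 seconds or longer" case explicitly.

```python
# Published TUG cutoffs (seconds) for elevated fall risk [32-34].
TUG_CUTOFFS_S = {
    "community-dwelling older adults": 13.5,
    "parkinson's disease": 11.5,
    "stroke": 14.0,
}

def elevated_fall_risk(tug_seconds: float, population: str) -> bool:
    """Flag elevated fall risk when the TUG time reaches the cutoff."""
    return tug_seconds >= TUG_CUTOFFS_S[population]

print(elevated_fall_risk(14.2, "community-dwelling older adults"))  # True
print(elevated_fall_risk(10.8, "parkinson's disease"))              # False
```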
The TUG test can also be used to monitor disease progression and changes in quality of life for patients with mobility impairments resulting from specific diseases, as demonstrated in studies of patients with Parkinson’s disease [35] and patients recovering from hip and knee arthroplasty [36]. Although the TUG is a valuable clinical tool, its limitations include the need for in-person assessment and the fact that its results capture only a snapshot in time.

V-B Digital Health Solution
To adapt the TUG test to a digital health platform, IBM Research scientists have developed an automated TUG prediction AI model that utilizes accelerometer data from a wrist-worn watch. This model was transformed into a microservice using the standard HG worker and API-gateway template [9].
To use this microservice, the subject first connects the HG mobile application to a wearable watch (e.g., a TicWatch or Samsung Galaxy Watch series device) that has the HG companion application installed. Next, the user selects the TUG microservice on the HG mobile application (Figure 5a.i), which prompts the user to press the “Start Sensor” button on the watch. The user then walks for at least 30 seconds until the watch buzzes to signal completion. The raw accelerometer signals are encrypted and sent in .json format via Bluetooth to the mobile application, which then uploads them to the HG backend services for further processing (Figure 5b). The analytic worker processes the accelerometer data and calculates the TUG prediction score as well as the average TUG score (if multiple walks were recorded in a given day) (Figure 5c). The results are then reported back to the mobile application via the microservice’s API-gateway (Figure 5a.iv).
V-C Analytics and Validation
The analysis framework for the TUG prediction model using wrist-worn accelerometers is described in detail in [10]. Briefly, the raw accelerometer data from the wrist-worn device are pre-processed to identify walking episodes and to calculate step durations using step detection (Figure 5c.i). From the step durations and the time differences between consecutive steps, 20 statistical features are extracted for each identified walking episode (Figure 5c.ii). A Random Forest model then predicts the TUG score from these statistical features, with the 25th and 5th percentiles of step duration and the mean step duration having the greatest impact on the predicted score.
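The shape of this pipeline, from step durations to a Random Forest prediction, can be sketched with scikit-learn. The five features below are representative examples only (the full model in [10] uses 20), and the training data here is synthetic, purely to make the sketch runnable.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def episode_features(step_durations: np.ndarray) -> np.ndarray:
    """A few step-duration statistics; [10] found the 5th and 25th
    percentiles and the mean to be the most influential."""
    diffs = np.diff(step_durations)  # change between consecutive steps
    return np.array([
        np.percentile(step_durations, 5),
        np.percentile(step_durations, 25),
        step_durations.mean(),
        step_durations.std(),
        diffs.mean() if diffs.size else 0.0,
    ])

# Fit on (synthetic) walking episodes labeled with clinician TUG times...
rng = np.random.default_rng(0)
X = np.stack([episode_features(rng.uniform(0.4, 0.9, 30)) for _ in range(100)])
y = rng.uniform(7.0, 20.0, 100)   # TUG times in seconds (synthetic)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# ...then predict a TUG score for a newly detected walking episode.
new_walk = episode_features(rng.uniform(0.4, 0.9, 30))
print(f"Predicted TUG: {model.predict(new_walk[None, :])[0]:.1f} s")
```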
To validate this model, it was applied to three datasets that contained wearable recordings of walks and TUG scores from 303 subjects, including healthy individuals, those with Parkinson’s disease, and those with mild cognitive impairment or dementia. The two public datasets used were the Long-term Movement Monitoring database (LTMM), in which subjects wore an accelerometer on their lower back, and the Gait in Parkinson’s Disease (GPD) database, which recorded accelerometer data using in-sole gait sensors. The third dataset, the Dementia Behavioral Study dataset (DBSD), was collected by the University of Tsukuba and IBM Research and used wrist-worn accelerometers to record data during in-lab walking. The validation results demonstrated that the Random Forest-based predictive model for TUG had good clinical correlation, achieving a prediction error of 1.7 +/- 1.7 seconds, with 84.8% of the predictions falling within the minimal detectable change across all three cohorts [10].
VI Discussion
The previous sections highlighted three distinct microservices designed to assess depression, sit-to-stand mobility, and functional mobility using text, video, and wearable data, respectively. While stand-alone microservices have clinical utility, recent research suggests that assessing clinical conditions under multitask conditions can lead to better accuracy and assessments [32]. For instance, a study of older adults with balance impairments found that performing a secondary task, such as a language task, resulted in more swaying, and that the effect of the secondary task on postural control depended on the subject’s balance abilities, the difficulty of the balance task, and the type of secondary task being performed [37].
To support multi-modal microservice deployment on the Health Guardian platform in clinical studies, we have developed a Django-based web portal called the Clinical Task Manager (CTM) to manage study designs and patient cohorts. The web portal offers an easy-to-use graphical user interface for researchers and clinicians with limited programming background to leverage the full utility of the platform.
The CTM supports several key clinical study processes:
• Adding and managing subjects
• Defining cohorts and assigning subjects
• Defining tests and grouping them into test-sets
• Dynamically defining tasks by assigning test-sets to cohorts based on rules
• Distributing tasks to edge devices for data collection
• Providing multi-tenancy support by organizing subjects and tasks under separate study projects
The data model of the CTM is shown in Figure 6. To design a clinical study, researchers first identify a patient population of interest, then use the CTM to filter and assign the subjects who meet the study enrollment criteria into a cohort (i.e., a subset of the subjects). Next, the researchers identify one or more microservices that the subjects in the cohort will be asked to perform (e.g., PHQ-8, sit-to-stand, TUG). All the questions for each microservice are stored as a ‘test-set’ in the CTM. A study ‘task’ is formed when one or more test-sets are mapped to a cohort. Once a task is created, the CTM distributes the task to all edge devices of subjects in the assigned cohort with the HG mobile application installed. Subjects are reminded to perform the tasks at the required time (e.g., within a certain time period after a subject wakes up if the task is to collect information about the subject’s sleep quality).
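Since the CTM is Django-based, these entities can be illustrated with a set of hypothetical Django model classes. The class and field names below are our own illustration of the subject/cohort/test-set/task relationships in Figure 6, not the CTM's actual schema.

```python
from django.db import models

class Study(models.Model):
    """Multi-tenancy boundary: subjects and tasks live under a study."""
    name = models.CharField(max_length=120)

class Subject(models.Model):
    study = models.ForeignKey(Study, on_delete=models.CASCADE)
    pseudonym = models.CharField(max_length=64, unique=True)

class Cohort(models.Model):
    """A subset of a study's subjects meeting enrollment criteria."""
    study = models.ForeignKey(Study, on_delete=models.CASCADE)
    subjects = models.ManyToManyField(Subject)

class TestSet(models.Model):
    """The questions/steps for one microservice, e.g. PHQ-8 or TUG."""
    name = models.CharField(max_length=120)

class Task(models.Model):
    """One or more test-sets mapped to a cohort; distributed to the
    edge devices of every subject in that cohort."""
    cohort = models.ForeignKey(Cohort, on_delete=models.CASCADE)
    test_sets = models.ManyToManyField(TestSet)
    scheduled_for = models.DateTimeField()
```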

There are several unique features of this data model. One feature is that tasks can be assigned dynamically based on rules. For example, if the goal of the clinical study is to explore the relationship between depression and mobility, the researcher can set up a rule whereby, each day, a PHQ-8 task is sent to a cohort, and the responses are filtered to create a sub-cohort of individuals whose scores fall below a certain value. A subsequent task, such as sit-to-stand or TUG, can then be assigned and sent to this new sub-cohort.
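Using the hypothetical models sketched above, such a rule might look like the following. The DataPoint model and its fields are additional assumptions (the next paragraph describes datapoints), and the real CTM rule engine is configured through its web interface rather than hand-written as code.

```python
from django.utils import timezone

def assign_followup(cohort, threshold: int):
    """Hypothetical daily rule: collect subjects whose PHQ-8 total fell
    below `threshold`, form a sub-cohort, and send it a follow-up
    sit-to-stand task. Assumes a DataPoint model with `subject`,
    `test_set`, and numeric `value` fields."""
    flagged = Subject.objects.filter(
        cohort=cohort,                        # reverse M2M lookup
        datapoint__test_set__name="PHQ-8",
        datapoint__value__lt=threshold,
    ).distinct()
    sub_cohort = Cohort.objects.create(study=cohort.study)
    sub_cohort.subjects.set(flagged)
    task = Task.objects.create(cohort=sub_cohort,
                               scheduled_for=timezone.now())
    task.test_sets.set(TestSet.objects.filter(name="Sit-to-Stand"))
    return task
```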
Another feature is that this data model generates a datapoint for each test completed by a subject. A datapoint can store a value, a string, or a data file along with associated metadata such as timestamps and user account information. These datapoints are grouped into datasets based on the test and test-set relationship. If one or more analytic pipelines have already been specified, the system automatically publishes the datasets to the Orbit service’s job queue for processing by the HG back-end components, such as the analytic workers and API-gateways. A more detailed description of the Orbit service and of setting up the analytic workers and API-gateways is provided in [9]. After the analytics are completed, the results are stored in a database for researchers and clinicians to access and review.
The CTM is a critical component of the HG platform that facilitates an end-to-end solution for clinical study design in the digital health domain. By utilizing the CTM, researchers can effortlessly design clinical studies with one or more deployed microservices and facilitate distributed data collection and processing. The CTM eliminates the challenges typically associated with establishing and maintaining a data pipeline, enabling researchers to concentrate on data analysis and insight generation. New AI/ML models can then be developed to conduct analytics and examine associations between health factors derived from data obtained and stored longitudinally from one or more microservices.
VII Conclusions
In conclusion, we presented an overview of the Health Guardian platform, a comprehensive solution for collecting multi-modal data to gain insights into individual health. The HG platform simplifies the cloud infrastructure and research components, providing standard worker and API-gateway templates to translate AI- and ML-based predictive models into microservices. With 70+ capabilities, researchers and clinicians can design clinical studies using various combinations of microservices, such as the PHQ-8, sit-to-stand, and TUG microservices described in this paper. The Clinical Task Manager’s user-friendly graphical user interface allows for easy setup of clinical cohorts and selection of one or more microservices to assign to a specific cohort. The HG platform is flexible and scalable, supports the entire data life cycle, and enables accelerated development of AI research and clinical validation.
Acknowledgment
The authors would like to acknowledge support from the IBM Research Accelerated Discovery Department.
References
- [1] R. Norel, M. Pietrowicz, C. Agurto, S. Rishoni, and G. Cecchi, “Detection of amyotrophic lateral sclerosis (als) via acoustic analysis,” Interspeech, pp. 377–381, 2018.
- [2] Y. Yamada, K. Shinkawa, and K. Shimmei, “Atypical repetition in daily conversation on different days for detecting alzheimer disease: evaluation of phone-call data from a regular monitoring service,” JMIR Mental Health, vol. 7, no. 1, pp. 1–13, 2020.
- [3] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, R. Kim, R. Raman, P. Q. Nelson, J. Mega, and D. Webster, “Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs,” JAMA, vol. 316, no. 22, pp. 2402–2410, 2016.
- [4] Q. Zou, Y. Wang, Q. Wang, Y. Zhao, and Q. Li, “Deep learning-based gait recognition using smartphones in the wild,” IEEE Transactions on Information Forensics and Security, pp. 1–15, 2020.
- [5] Q. Jun, P. Yang, A. Waraich, Z. Deng, Y. Zhao, and Y. Yang, “Examining sensor-based physical activity recognition and monitoring for healthcare using internet of things: A systematic review,” Journal of Biomedical Informatics, pp. 138–153, 2018.
- [6] M. Kobayashi, Y. Yamada, K. Shinkawa, M. Nemoto, K. Nemoto, and T. Arai, “Automated early detection of alzheimer’s disease by capturing impairments in multiple cognitive domains with multiple drawing tasks,” Journal of Alzheimer’s Disease, vol. 88, pp. 1075–1089, 2022.
- [7] Y. Yamada, K. Shinkawa, M. Kobayashi, V. Caggiano, M. Nemoto, K. Nemoto, and T. Arai, “Combining multimodal behavioral data of gait, speech, and drawing for classification of alzheimer’s disease and mild cognitive impairment,” Journal of Alzheimer’s Disease, vol. 84, pp. 315–327, 2021.
- [8] K. Karkkainen, G. Lyng, B. L. Hill, K. Vodrahalli, J. Hertzberg, and E. Halperin, “Sleep and activity prediction for type 2 diabetes management using continuous glucose monitoring,” Workshop on Learning from Time Series for Health, 36th Conference on Neural Information Processing Systems, pp. 1–7, 2022.
- [9] B. Wen, V. S. Siu, I. Buleje, K. Y. Hsieh, T. Itoh, L. Zimmerli, N. Hinds, E. Eyigoz, B. Dang, S. von Cavallar, and J. L. Rogers, “Health guardian platform: A technology stack to accelerate discovery in digital health research,” IEEE International Conference on Digital Health (ICDH), pp. 40–46, 2022.
- [10] T. Hao, Y. Yamada, J. L. Rogers, K. Shinkawa, M. Nemoto, K. Nemoto, and T. Arai, “An automated digital biomarker of mobility,” IEEE International Conference on Digital Health (ICDH), 2023.
- [11] D. Mehta, U. Asif, T. Hao, E. Bilal, S. von Cavallar, S. Harrer, and J. Rogers, “Towards automated and marker-less parkinson disease assessment: Predicting updrs scores using sit-stand videos,” IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 3836–3844, 2021.
- [12] V. Anand, E. Bilal, B. Ho, and J. J. Rice, “Towards motor evaluation of parkinson’s disease patients using wearable inertial sensors,” AMIA Annu Symp Proc, pp. 203–212, 2020.
- [13] E. Eyigoz, S. Mathur, M. Santamaria, G. Cecchi, and M. Naylor, “Linguistic markers predict onset of alzheimer’s disease,” eClinicalMedicine, vol. 28, pp. 1–9, 2020.
- [14] Y. Yamada, M. Kobayashi, K. Shinkawa, M. Nemoto, M. Ota, K. Nemoto, and T. Arai, “Automated analysis of drawing process for detecting prodromal and clinical dementia,” IEEE International Conference on Digital Health, pp. 1–6, 2022.
- [15] K. Kroenke, T. W. Strine, R. L. Spitzer, J. B. Williams, J. T. Berry, and A. H. Mokdad, “The phq-8 as a measure of current depression in the general population,” Journal of Affective Disorders, vol. 114, no. 1, pp. 163–173, 2009.
- [16] C. Agurto, P. Pataranutaporn, E. K. Eyigoz, G. Stolovitzky, and G. Cecchi, “Predictive linguistic markers of suicidality in poets,” IEEE International Conference on Semantic Computing, pp. 282–285, 2018.
- [17] J. Bai, C. Di, L. Xiao, K. R. Evenson, A. Z. LaCroix, C. M. Crainiceanu, and D. M. Buchner, “An activity index for raw accelerometry data and its comparison with other activity metrics,” PLoS ONE, vol. 11, no. 8, pp. 1–14, 2016.
- [18] A. Shahid, K. Wilkinson, S. Marcu, and C. M. Shapiro, Stanford Sleepiness Scale (SSS). Springer, New York, NY, 2011.
- [19] Y. Yamada, K. Shinkawa, M. Kobayashi, H. Takagi, M. Nemoto, K. Nemoto, and T. Arai, “Using speech data from interactions with a voice assistant to predict the risk of future accidents for older drivers: prospective cohort study,” Journal of Medical Internet Research, vol. 23, no. 4, pp. 315–327, 2021.
- [20] P. Schmitt, J. Mandel, and M. Guedj, “A comparison of six methods for missing data imputation,” Journal of Biometrics and Biostatistics, vol. 6, no. 1, pp. 1–6, 2015.
- [21] National Institute of Mental Health, “Major depression.” [Online]. Available: https://www.nimh.nih.gov/health/statistics/major-depression
- [22] A. T. Beck, R. A. Steer, G. K. Brown et al., Beck Depression Inventory. New York: Harcourt Brace Jovanovich, 1987.
- [23] J. D. Henry and J. R. Crawford, “The short-form version of the depression anxiety stress scales (dass-21): Construct validity and normative data in a large non-clinical sample,” British Journal of Clinical Psychology, vol. 44, no. 2, pp. 227–239, 2005.
- [24] I. Razykov, R. C. Ziegelstein, M. A. Whooley, and B. D. Thombs, “The phq-9 versus the phq-8 — is item 9 useful for assessing suicide risk in coronary artery disease patients? data from the heart and soul study,” Journal of Psychosomatic Research, vol. 73, no. 3, pp. 163–168, 2012.
- [25] C. Shin, S.-H. Lee, K.-M. Han, H.-K. Yoon, and C. Han, “Comparison of the usefulness of the phq-8 and phq-9 for screening for major depressive disorder: Analysis of psychiatric outpatient data,” Psychiatry Investig, vol. 16, no. 4, pp. 300–305, 2019.
- [26] T. S. Wells, J. L. Horton, C. A. LeardMann, I. G. Jacobson, and E. J. Boyko, “A comparison of the prime-md phq-9 and phq-8 in a large military prospective study, the millennium cohort study,” Journal of Affective Disorders, vol. 148, no. 1, pp. 77–83, 2013.
- [27] W. Poewe, K. Seppi, C. M. Tanner, G. M. Halliday, P. Brundin, J. Volkmann, A.-E. Schrag, and A. E. Lang, “Parkinson disease,” Nature Reviews Disease Primers, vol. 3, no. 17013, pp. 1–21, March 2017.
- [28] C. Li, Q. Zhong, D. Xie, and S. Pu, “Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation,” arXiv, 2018.
- [29] S. Yan, Y. Xiong, and D. Lin, “Spatial temporal graph convolutional networks for skeleton-based action recognition,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, 2018.
- [30] C. Caetano, J. Sena, F. Brémond, J. A. Dos Santos, and W. R. Schwartz, “Skelemotion: A new representation of skeleton joint sequences based on motion information for 3d action recognition,” IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 1–8, 2019.
- [31] T. M. Steffen, T. A. Hacker, and L. Mollinger, “Age- and gender-related test performance in community-dwelling elderly people: Six-minute walk test, berg balance scale, timed up and go test, and gait speeds,” Physical Therapy, vol. 82, no. 2, pp. 128–137, February 2002.
- [32] A. Shumway-Cook, S. Brauer, and M. Woollacott, “Predicting the probability for falls in community-dwelling older adults using the timed up & go test,” Physical Therapy, vol. 80, no. 9, pp. 896–903, September 2000.
- [33] J. R. Nocera, E. L. Stegemöller, I. A. Malaty, M. S. Okun, M. Marsiske, C. J. Hass, and the NPF-QII Investigators, “Using the timed up and go test in a clinical setting to predict falling in parkinson’s disease,” Archives of Physical Medicine and Rehabilitation, vol. 94, no. 7, pp. 1300–1305, July 2013.
- [34] A. G. Andersson, K. Kamwendo, A. Seiger, and P. Appelros, “How to identify potential fallers in a stroke unit: validity indexes of 4 test methods,” Journal of Rehabilitation Medicine, vol. 38, no. 3, pp. 186–191, May 2006.
- [35] T. Evans, A. Jefferson, M. Byrnes, S. Walters, S. Ghosh, F. L. Mastaglia, B. Power, and R. S. Anderton, “Extended “timed up and go” assessment as a clinical indicator of cognitive state in parkinson’s disease,” Journal of the Neurological Sciences, vol. 375, pp. 86–91, April 2017.
- [36] D. M. Kennedy, S. E. Hanna, P. W. Stratford, J. Wessel, and J. D. Gollish, “Preoperative function and gender predict pattern of functional recovery after hip and knee arthroplasty,” Journal of Arthroplasty, vol. 21, no. 4, pp. 559–566, June 2006.
- [37] A. Shumway-Cook, M. Woollacott, K. A. Kerns, and M. Baldwin, “The effects of two types of cognitive tasks on postural stability in older adults with and without a history of falls.” The Journals of Gerontology: Series A, vol. 52A, no. 4, pp. M232–M240, July 1997.