Bidirectional GaitNet: A Bidirectional Prediction Model of Human Gait and Anatomical Conditions
Jungnam Park (Department of Computer Science and Engineering, Seoul National University, South Korea)
Moon Seok Park (Department of Orthopaedic Surgery, Seoul National University Bundang Hospital, South Korea)
Jehee Lee (NCsoft, South Korea; Department of Computer Science and Engineering, Seoul National University, South Korea)
Jungdam Won (ORCID 0000-0001-5510-6425; Department of Computer Science and Engineering, Seoul National University, South Korea)
(2023)
Abstract.
We present a novel generative model, called Bidirectional GaitNet, that learns the relationship between human anatomy and its gait. The simulation model of human anatomy is a comprehensive, full-body, simulation-ready, musculoskeletal model with 304 Hill-type musculotendon units. The Bidirectional GaitNet consists of forward and backward models. The forward model predicts a gait pattern of a person with specific physical conditions, while the backward model estimates the physical conditions of a person when their gait pattern is provided. Our simulation-based approach first learns the forward model by distilling the simulation data generated by a state-of-the-art predictive gait simulator and then constructs a Variational Autoencoder (VAE) with the learned forward model as its decoder. Once the VAE is learned, its encoder serves as the backward model. We demonstrate our model on a variety of healthy/impaired gaits and validate it in comparison with physical examination data of real patients.
Submission ID: 271. Published in: Special Interest Group on Computer Graphics and Interactive Techniques Conference Proceedings (SIGGRAPH ’23 Conference Proceedings), August 6–10, 2023, Los Angeles, CA, USA. Copyright: ACM licensed. DOI: 10.1145/3588432.3591492. ISBN: 979-8-4007-0159-7/23/08.
CCS Concepts: Computing methodologies: Physical simulation; Motion capture; Reinforcement learning; Learning from demonstrations.
1. Introduction
Realistic simulation of human movement is one of the long-standing challenges in computer graphics. Musculoskeletal simulation has provided stepping stones to reproduce a range of human movements at the biomechanical and anatomical levels, adding significant realism to the movements created. It can also be used to predict how changes in anatomical conditions (e.g., bone deformity, muscle capacity/deficiency, mass distribution) and intrinsic/extrinsic factors (e.g., metabolic energy expenditure, fatigue, pain) affect human movement.
Accurate setting of anatomical conditions is the very first and fundamental process for generating realistic movements in musculoskeletal simulation as it creates the space for anatomically plausible human movements. A number of biomechanical studies have been conducted to accurately model these conditions through various experiments on human subjects and cadavers (Arnold et al., 2010; Rajagopal et al., 2016; Delp et al., 1990, 2007; Carbone et al., 2015). Several standard models for the typical/average human body have been adopted by the research community (Delp et al., 1990, 2007) and those average body models have significantly contributed to recent progress in biomechanics research (Dembia et al., 2020; Lee et al., 2019; Park et al., 2022). Building an accurate musculoskeletal model of a specific (healthy or impaired) person has long been a notoriously difficult challenge since accurate modeling of live organs and tissues often requires invasive measurements.
In this paper, we address the problem of estimating the physiological parameters of a musculoskeletal model from observed gait cycles. Our anatomy model describes the conditions of individual bones and muscles with 300+ parameters. Conceptually speaking, we want to uncover the relationship between human anatomy and gait. Although this relationship is well accepted empirically in biomechanics and clinical gait analysis, the relationship is probabilistic rather than a one-to-one deterministic mapping. For example, many people with different physical conditions can walk with a similar gait, and conversely, there is no guarantee that two people with similar physical conditions will walk with a similar gait. In this paper, we build a novel generative model, which we call Bidirectional GaitNet, based on a comprehensive full-body, simulation-ready, musculoskeletal model. The Bidirectional GaitNet consists of forward and backward models. The forward model is functionally equivalent to predictive gait simulation, which generates a bipedal gait for any anatomical model with a specific set of physical conditions. Conversely, the backward model is its inverse process that estimates the physical conditions of the model given a gait. More specifically, we first learn the forward model by distilling the simulation data generated by a state-of-the-art predictive gait simulator and then construct a conditional Variational Autoencoder (c-VAE) with the forward model as its decoder. Once it is learned, its encoder serves as the backward model. By the nature of VAE, our backward model generates a distribution of physical conditions that potentially produce the input gait in predictive gait simulation, from which many different physical conditions can be sampled.
We demonstrate the power of our model by showing results with a variety of healthy/impaired gaits. The simulated results are validated in comparison to not only unseen simulated gaits but also gaits from real patients. The effectiveness of the non-trivial system design choices that we made for developing our model is also validated by ablation studies. Code for this paper is available at https://github.com/namjohn10/BidirectionalGaitNet.
2. Related work
In musculoskeletal simulation, the human body is typically modeled by rigid bones and flexible musculotendon units. The bones are connected by rotational joints to which the musculotendon units are attached such that they can actuate the joints. The muscle dynamics are often formulated using Hill-type muscles (Zajac, 1989; Delp et al., 2007) composed of contractile and elastic elements, and the dynamics of each element are determined by force-length and force-velocity curves. Accurate simulation requires adequate determination of all anatomical conditions of the bones and the muscles based on reliable measurements. A set of parameters for a typical/average human has been estimated by taking measurements from live tissues and cadavers (Delp et al., 1990; Rajagopal et al., 2016; Arnold et al., 2010; Carbone et al., 2015). Estimates thus obtained have been used as default parameters in many simulation-based studies (Delp et al., 2007; Lee et al., 2019; Dembia et al., 2020).
Anthropometric scaling of a musculoskeletal model can generate a range of models with different heights, weights, and limb lengths (Ryu et al., 2021). Building accurate individualized models often requires expensive medical images (e.g., CT and MRI) and labor-intensive image labeling (Matias et al., 2009; Levin et al., 2011; Li et al., 2022).
Building robust dynamic controllers that can drive musculoskeletal models has been regarded as an open challenge for decades because musculoskeletal models are high-dimensional and highly nonlinear, and their control systems are often partly under-actuated and, at the same time, partly over-determined. In this paper, we focus on legged locomotion, although there has been a series of studies on simulating other body parts and behaviors, such as upper bodies (Lee et al., 2009, 2018), eyes (Nakada et al., 2018), faces (Ichim et al., 2017; Cong et al., 2016), hands (Sachdeva et al., 2015), and swimming (Si et al., 2014).
Biomechanics researchers have developed locomotion controllers to answer research questions, such as how weakening certain muscles affects gait patterns. Open-loop control (Sok et al., 2007; Liu et al., 2010; Al Borno et al., 2013; Falisse et al., 2019), model-based feedback control (Geyer and Herr, 2010; Yin et al., 2007; Lee et al., 2010; Ye and Liu, 2010; Coros et al., 2010; Ha et al., 2012; Liu et al., 2016; Song and Geyer, 2015; Ong et al., 2019), and their combinations (Mordatch and Todorov, 2014) have been explored, where the controller design is often motivated by structural and functional hypotheses on nervous and motor control systems. On the other hand, computer animation researchers have focused on making physically simulated characters move in a lifelike, human manner by adding actuation constraints imposed by muscle dynamics. Wang et al. (2012) developed walking controllers for 3D biped characters equipped with 8 Hill-type muscles per leg, using the muscle-reflex model proposed by Geyer and Herr (2010). Stochastic optimization is used to determine feedback control parameters such that biped characters can maintain their balance while walking. Geijtenbeek et al. (2013) applied a similar approach to non-human musculoskeletal characters, where manually specified initial muscle routings are further optimized to improve the motor skills of the characters. Lee et al. (2014) demonstrated interactive controllers for full-body musculoskeletal characters having more than 100 muscles; their method first runs offline optimization to refine the reference motion, and then an online controller based on quadratic programming tracks the refined reference motion at runtime.
Recently, deep reinforcement learning (DRL) has successfully demonstrated its capabilities in solving high-dimensional, continuous control problems including human motion imitation (Yu et al., 2019; Peng et al., 2018; Bergamin et al., 2019; Park et al., 2019; Won et al., 2020; Peng et al., 2021; Merel et al., 2019; Peng et al., 2022; Won et al., 2022), motion control in complex environments (Clegg et al., 2018; Liu and Hodgins, 2018; Won et al., 2021; Yang et al., 2022; Ye et al., 2022; Winkler et al., 2022) and non-human character control (Yu et al., 2018; Luo et al., 2020; Lee et al., 2022; Ishiwaka et al., 2022). The control of musculoskeletal characters is no exception to these technological innovations; in particular, controllers based on DRL have been significantly improved in terms of robustness against external perturbation, computational efficiency at runtime, and the scope of reproducible motor skills. Many simulation algorithms for controlling biped musculoskeletal models competed in the Learn-to-Move challenges (Kidziński et al., 2020), and DRL-based algorithms performed well. The winner of the final challenge proposed a model-based DRL algorithm equipped with an ensemble of probabilistic dynamics models and a risk minimization scheme by expanding the lower confidence bound of the value estimation (Kidziński et al., 2018). Lee et al. (2019) developed a two-level control architecture composed of a trajectory mimicking network and a muscle coordination network, by which comprehensive musculoskeletal characters with more than 300 muscles successfully reproduced highly dynamic human movements such as running, jumping, and cartwheels. Yifeng et al. (2019) proposed a new action space for DRL that mimics the behaviors of musculoskeletal simulation with computationally more efficient torque-based simulation. Park et al. (2022) presented a predictive gait simulation framework, Generative GaitNet, which can predict a broad spectrum of healthy and pathological gaits of comprehensive musculoskeletal models over a high-dimensional parameter space spanned by anatomical (e.g., bone/muscle parameters, mass distribution, muscle capacity) and gait (e.g., stride and cadence) conditions.
3. Human anatomy and gait
The musculoskeletal model in this work is designed to represent healthy and pathological gaits often discussed in clinical gait analysis and medical engineering. Our model consists of 23 bones connected by 22 skeletal joints and 304 musculotendons. The activation of muscles generates contraction forces that drive skeletal joints. The individual bones and muscles are conditioned by anatomical parameters. Let $\mathbf{c}_a$ be the anatomical condition, where $\mathbf{s}$ represents the scaling factors of the head, the trunk, and the lengths of the four (upper and lower) limbs with respect to a reference (average) model of healthy adults. $\theta_l$ and $\theta_r$ are the torsional angles of the femurs, which correspond to femoral anteversion and retroversion often observed in pathological gait. Each musculotendon is conditioned by two parameters: weakness and contracture. The weakness parameter of a muscle indicates the ratio of the maximum isometric force the muscle can exert relative to the reference model. The higher the parameter, the larger the force the muscle can produce. The contracture parameter indicates the scaling factor of the muscle length relative to the reference model; contracture refers to the permanent shortening of a muscle that often limits the range of joint movements.
Conventionally, a complete two-step cycle of gait begins with a left heel strike, where both feet are in contact with the ground at the same time, and ends at the next left heel strike. To avoid the singularity at both ends of the gait cycle, two gait cycles are considered as the basic unit of gait in our system. Therefore, the gait pattern is represented by a time series of full-body poses $\mathbf{M} = (\mathbf{p}_1, \ldots, \mathbf{p}_T)$, where each pose $\mathbf{p}_t = (h, \mathbf{v}, \mathbf{q}_0, \mathbf{q}_1, \ldots, \mathbf{q}_n)$ consists of the root (pelvis) height $h$ from the ground, the root linear velocity $\mathbf{v}$ parallel to the ground plane, and joint rotations $\mathbf{q}_1, \ldots, \mathbf{q}_n$ with respect to their parent joints. $\mathbf{q}_0$ denotes the global orientation of the root body node. We use the first two columns of the rotation matrix to describe each rotation. The sample rate is 60 poses per two gait cycles in our implementation. The gait is conditioned by two parameters: stride and cadence.
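The two-column rotation encoding mentioned above can be illustrated with a short sketch. It assumes the common convention of recovering the third column by Gram-Schmidt orthonormalization followed by a cross product; the function names are illustrative and are not taken from the released code.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_to_6d(R):
    """Flatten the first two columns of a 3x3 rotation matrix (list of rows)."""
    return [R[i][0] for i in range(3)] + [R[i][1] for i in range(3)]

def rotation_from_6d(r6):
    """Rebuild a full rotation matrix from its first two columns."""
    c1 = normalize(r6[:3])
    overlap = sum(x * y for x, y in zip(c1, r6[3:]))
    # Remove the component of the second column along the first (Gram-Schmidt).
    c2 = normalize([y - overlap * x for x, y in zip(c1, r6[3:])])
    c3 = cross(c1, c2)  # The third column follows from orthogonality.
    return [[c1[i], c2[i], c3[i]] for i in range(3)]

# Round trip for a 30-degree rotation about the vertical axis.
angle = math.radians(30.0)
R = [[math.cos(angle), -math.sin(angle), 0.0],
     [math.sin(angle), math.cos(angle), 0.0],
     [0.0, 0.0, 1.0]]
R_back = rotation_from_6d(rotation_to_6d(R))
```

Six numbers per joint are enough because the third column of a rotation matrix is fully determined by the first two, so nothing is lost in the round trip.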
Figure 1. Predictive gait simulation and intelligent gait analysis.
The predictive gait simulation generates a gait pattern that likely occurs given anatomical and gait conditions (see Figure 1). Conversely, gait analysis is its inverse process of predicting the anatomical and gait conditions of a given gait pattern. The human musculoskeletal model is a dynamical system that is partly under-actuated because the root node is not actuated and, at the same time, partly over-actuated because the body has more muscles than minimally required to drive the skeletal joints. This redundancy makes the whole process probabilistic.
The predictive power of gait simulation stems from making full use of the laws of physics, providing accurate anatomical modeling, and designing reliable control policies for seemingly unstable biped locomotion. Recent approaches successfully derived robust control policies through deep reinforcement learning (Lee et al., 2019; Kidziński et al., 2018). In this case, the core of a predictive gait simulation is a policy network, which takes the current state of the musculoskeletal model as input and outputs the desired level of activation at all muscles. The state-of-the-art method learned control policies conditioned by high-dimensional conditioning vectors (more than 600 parameters) (Park et al., 2022). The musculoskeletal model with specific physical conditions driven by a policy network generates a trace of full-body poses in physics-based simulation.
In this paper, the key challenge is to find the inverse process of predictive simulation. Given a gait pattern, the goal of gait analysis is to find the corresponding physical conditions. There are three issues to be addressed. First, the search space is high-dimensional. In our system, conditioning vectors are 280-dimensional. State space search or optimization in such a high-dimensional space is often computationally intractable. Second, the solution is not unique. As discussed before, the backward mapping is probabilistic, and thus we need to find a probability distribution over the conditioning space rather than a single optimal solution. Lastly, the computational cost should be reasonable. Rolling out a single gait pattern from a predictive simulator requires a physics-based simulation over a complete gait cycle, and this cost becomes substantial when many gait patterns must be sampled stochastically.
We address these issues by pretraining forward and backward networks. We transfer control policies from policy networks to a forward network representing a direct anatomy-to-gait mapping. The transfer process is similar to policy distillation: it takes samples from the policy network and trains the target network using supervised regression. The pretrained forward network has two advantages. First, it generates physically valid gait patterns without actually performing physics-based simulations because it imitates the behavior of the policy network. Second, this direct anatomy-to-gait mapping is computationally more efficient at runtime than predictive gait simulation. This efficiency also makes it feasible to pretrain the backward network employing a conditional Variational AutoEncoder (c-VAE) that models the probabilistic mapping between anatomy and gait using Gaussian distributions in latent space.
4. Bidirectional GaitNet
Let $\mathbf{c}_a$ and $\mathbf{c}_g$ be anatomical and gait conditions, respectively. The gait pattern $\mathbf{M}$ includes two gait cycles, which are parameterized by phase $\phi$.
Our Bidirectional GaitNet learns the relationship between anatomy and gait by conditional forward mapping $(\mathbf{c}_a, \mathbf{c}_g) \mapsto \mathbf{M}$ and backward mapping $\mathbf{M} \mapsto (\mathbf{c}_a, \mathbf{c}_g)$. This formulation has several issues in designing network models. The size of the forward network depends on the frame rate and can be larger than actually needed. Given a gait pattern, its stride, cadence, body proportions, and limb lengths are readily available in the motion capture process or can be directly computed from the gait data. To address these issues, we simplify the forward and backward mappings, respectively, to $F: (\mathbf{c}_a, \mathbf{c}_g, \phi) \mapsto \mathbf{p}$ and $B: (\mathbf{M}, \mathbf{c}_s, \mathbf{c}_g) \mapsto \mathbf{c}_m$, where $\mathbf{c}_s$ and $\mathbf{c}_m$ denote the skeleton and muscle parts of $\mathbf{c}_a$. Here we added the phase $\phi$ to the input layer of the forward network and reduced the output layer such that it generates a full-body pose $\mathbf{p}$ at phase $\phi$ instead of a full gait pattern. This design decision significantly reduces the complexity of the prediction and allows us to achieve improved accuracy with smaller regression networks. The backward network also becomes smaller by removing the skeleton and gait conditions from the output layer.
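As a minimal sketch of the phase-wise design, the snippet below assembles a full gait pattern by querying a pose-level forward model at evenly spaced phases. The forward model here is a toy stand-in, the phase range [0, 2) for two gait cycles is an assumption, and the pose size of 101 is inferred from the 6060-dimensional quantity reported in Section 5 (6060 / 60), assuming it is the flattened gait pattern.

```python
import math

POSE_DIM = 101      # assumed: 6060 / 60, not stated explicitly in the text
NUM_SAMPLES = 60    # poses sampled per two gait cycles (from the text)

def toy_forward_pose(conditions, phase):
    """Hypothetical stand-in for Forward GaitNet: conditions + phase -> one pose."""
    scale = sum(conditions) / len(conditions)
    return [scale * math.sin(2.0 * math.pi * phase + k) for k in range(POSE_DIM)]

def assemble_gait(forward_pose, conditions, num_samples=NUM_SAMPLES, num_cycles=2.0):
    """Query the phase-wise model at evenly spaced phases and stack the poses."""
    phases = [num_cycles * i / num_samples for i in range(num_samples)]
    return [forward_pose(conditions, phase) for phase in phases]

gait = assemble_gait(toy_forward_pose, conditions=[1.0, 0.8, 1.2])
flat = [value for pose in gait for value in pose]  # flattened gait pattern
```

Because the network only ever outputs one pose, its output layer stays the same size regardless of how densely the gait is sampled; a denser gait simply means more queries.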
4.1. Forward GaitNet
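The Forward GaitNet is a regression network learned in a supervised manner with training data generated by a predictive gait simulator. In our work, we use Generative GaitNet (Park et al., 2022) to generate a collection of condition-gait tuples randomly sampled over the domain of anatomy and gait conditions. The loss function for regression is
$$L_F = \mathbb{E}\big[\, d(\mathbf{p}, \hat{\mathbf{p}}) \,\big] \qquad (1)$$
where $\hat{\mathbf{p}} = F(\mathbf{c}_a, \mathbf{c}_g, \phi)$ is the output of the regression network for anatomical conditions $\mathbf{c}_a$, gait conditions $\mathbf{c}_g$, and phase $\phi$, $\mathbf{p}$ is the corresponding simulated pose, and the expectation is taken over the training tuples. $d$ measures the difference between two full-body poses
$$d(\mathbf{p}, \hat{\mathbf{p}}) = w_h \lVert h - \hat{h} \rVert + w_v \lVert \mathbf{v} - \hat{\mathbf{v}} \rVert + w_q \sum_j \lVert \mathbf{q}_j \ominus \hat{\mathbf{q}}_j \rVert \qquad (2)$$
where it computes the weighted sum of the differences in root heights $h$, root velocities $\mathbf{v}$, and joint rotations $\mathbf{q}_j$. $w_h$, $w_v$, and $w_q$ are the weights of their corresponding terms and $\ominus$ is the difference between two rotation matrices.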
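The weighted pose difference described above might be sketched as follows. The unit weights, the Euclidean norm, and the use of the flattened two-column rotation encoding for the rotation term are all assumptions made for illustration; the text does not fix these choices.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pose_distance(p, q, w_height=1.0, w_velocity=1.0, w_rotation=1.0):
    """Weighted sum of root-height, root-velocity, and joint-rotation differences."""
    d_height = abs(p["height"] - q["height"])
    d_velocity = euclidean(p["velocity"], q["velocity"])
    d_rotation = sum(euclidean(rp, rq) for rp, rq in zip(p["rotations"], q["rotations"]))
    return w_height * d_height + w_velocity * d_velocity + w_rotation * d_rotation

def regression_loss(predicted_poses, target_poses):
    """Mean pose distance over a batch, used as a supervised regression objective."""
    pairs = list(zip(predicted_poses, target_poses))
    return sum(pose_distance(p, q) for p, q in pairs) / len(pairs)

# One joint with an identity rotation (first two columns of the identity matrix).
identity_6d = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
target = {"height": 0.95, "velocity": [1.2, 0.0], "rotations": [identity_6d]}
predicted = {"height": 0.90, "velocity": [1.2, 0.3], "rotations": [identity_6d]}
loss = regression_loss([predicted], [target])
```

In this toy case the loss is the height error (0.05) plus the velocity error (0.3), since the rotations agree.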
Figure 2. The structure of Bidirectional GaitNet.
4.2. Backward GaitNet
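The Forward GaitNet and the Backward GaitNet constitute a conditional Variational AutoEncoder (c-VAE) that takes a gait pattern as input (see Figure 2). Once the Forward GaitNet is learned through regression, we learn the Backward GaitNet while keeping the parameters of the Forward GaitNet fixed. The loss function is
$$L_B = D(\mathbf{M}, \hat{\mathbf{M}}) + w_{KL}\, D_{KL}\big( q(\mathbf{z} \mid \mathbf{M}, \mathbf{c}_s, \mathbf{c}_g) \,\Vert\, \mathcal{N}(\mathbf{0}, \mathbf{I}) \big) + w_{reg} \sum_i \lVert \hat{c}_m^{\,i} - \bar{c}_m^{\,i} \rVert \qquad (3)$$
where $\mathbf{M}$ is the input gait pattern, $\mathbf{c}_s$ and $\mathbf{c}_g$ are the skeleton and gait conditions, $\mathbf{z}$ is the latent code, and $\hat{\mathbf{M}}$ is a gait pattern reconstructed through the autoencoder.
$$\hat{\mathbf{M}} = \big( F((\mathbf{c}_s, \hat{\mathbf{c}}_m), \mathbf{c}_g, \phi_1), \ldots, F((\mathbf{c}_s, \hat{\mathbf{c}}_m), \mathbf{c}_g, \phi_T) \big) \qquad (4)$$
Note that $\hat{\mathbf{M}}$ here is a gait pattern generated by concatenating the output of the phase-wise forward network $F$ over two gait cycles. $D_{KL}$ measures KL-divergence, $\hat{c}_m^{\,i}$ is the $i$-th element of the predicted muscle conditions $\hat{\mathbf{c}}_m$, $\bar{c}_m^{\,i}$ is its reference value, and $D$ is the difference between two gait patterns. The first term is the reconstruction loss between input and output gaits, and the second term is the typical KL-divergence loss for VAE learning, where we use the standard normal distribution as our prior. The last term regularizes the predicted conditions to the reference conditions (i.e., typical healthy adults). $w_{KL}$ and $w_{reg}$ are the weights of their corresponding terms.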
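The three-term objective described above can be sketched numerically. The KL term below is the standard closed form for a diagonal Gaussian against a standard normal prior; the squared penalty toward reference conditions of 1.0 and the weight values are assumptions for illustration only.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

def backward_loss(recon_error, mu, log_var, predicted, reference, w_kl=1.0, w_reg=1.0):
    """Reconstruction + weighted KL + weighted pull toward reference conditions."""
    kl = kl_to_standard_normal(mu, log_var)
    # Assumed regularizer: squared deviation from the reference conditions.
    reg = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return recon_error + w_kl * kl + w_reg * reg

loss = backward_loss(
    recon_error=0.2,          # gait difference between input and reconstruction
    mu=[1.0, 0.0],            # encoder mean of a 2D toy latent code
    log_var=[0.0, 0.0],       # unit variance
    predicted=[0.9, 1.0],     # predicted muscle conditions
    reference=[1.0, 1.0],     # hypothetical healthy reference values
    w_kl=0.1,
    w_reg=0.5,
)
```

Here the KL term equals 0.5 and the regularizer equals 0.01, so the total is 0.2 + 0.05 + 0.005 = 0.255. Changing the two weights shifts the balance between reconstructing the gait faithfully and staying close to a healthy reference.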
Note that the probability distribution is located in the middle of the backward model rather than its output so that muscle conditions are encoded in the low-dimensional latent space, where the latent codes are then decoded to muscle conditions via the pre-decoder (see Figure 2).
This structure has several advantages over modeling muscle conditions directly as stochastic latent variables. First, many muscles are functionally correlated, meaning that muscle coordination can have lower intrinsic dimensions and potentially be modeled efficiently in low-dimensional spaces. Second, the multi-modality of muscle conditions can be better preserved. In c-VAE, the latent space is often modeled as a uni-modal probability distribution, which is not ideal for our case because there exist disparate muscle conditions that generate almost the same gait. The pre-decoder structure enables us to map a simple uni-modal distribution to a complex multi-modal distribution.
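A minimal sketch of how candidate muscle conditions could be drawn from such a model: sample latent codes with the usual reparameterization and push them through a pre-decoder. The toy pre-decoder below is hypothetical; only its sigmoid output mirrors the activation reported for the pre-decoder in Section 5.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def toy_pre_decoder(z):
    """Hypothetical stand-in: 2D latent code -> three muscle conditions in (0, 1)."""
    return [sigmoid(z[0] + z[1]), sigmoid(z[0] - z[1]), sigmoid(2.0 * z[1])]

def sample_muscle_conditions(mu, log_var, pre_decoder, num_samples, rng):
    """Draw latent codes from N(mu, diag(exp(log_var))) and decode each one."""
    samples = []
    for _ in range(num_samples):
        # Reparameterization: z = mu + sigma * eps with eps ~ N(0, I).
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
        samples.append(pre_decoder(z))
    return samples

rng = random.Random(0)
candidates = sample_muscle_conditions(
    mu=[0.3, -0.5], log_var=[-1.0, -1.0],
    pre_decoder=toy_pre_decoder, num_samples=1000, rng=rng,
)
```

The nonlinear decoder is what lets a single Gaussian in latent space spread into a more complicated set of muscle conditions.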
4.3. Learning
4.3.1. Selection of Muscle Parameters
The musculoskeletal model we use has 604 muscle conditions in total, which cover the upper and lower bodies. We observed that predicting muscle conditions for the upper body (especially for the arms) could harm the overall performance because upper-body motion is only loosely correlated with its muscle conditions, owing to the higher freedom of motion compared to the lower body. Intuitively speaking, people can perform almost any upper-body motion while walking, which is not true for the lower body. To mitigate this problem, we only include muscle conditions for the lower body as well as a few conditions of the muscles attached to the hip (e.g., iliopsoas), 280 in total.
4.3.2. Mixture of Multiple Backward GaitNets
The space of muscle conditions covered by our musculoskeletal model is highly diverse, and we also aim to infer extreme muscle conditions that occur in many patients. To better capture such high variations, we learn multiple Backward GaitNets, then choose one of their outputs by evaluating the results qualitatively and quantitatively (if possible). This strategy is similar to mixture-of-experts (Masoudnia and Ebrahimpour, 2014), where each expert covers a subspace of the entire space. Furthermore, this process is analogous to having a single patient diagnosed by several doctors, each of whom is good at diagnosing a specific pathology. In this work, we use three different models. One is trained only for the muscle conditions of knees and ankles; the other two are trained for all muscles with different weights in Equation 3.
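The quantitative part of this selection could look like the sketch below: run each backward model, re-simulate the gait from its predicted conditions, and keep the prediction that reproduces the input gait best. The mean-absolute-error criterion, the model names, and the toy models are assumptions; the text does not specify how the outputs are scored.

```python
def gait_error(gait_a, gait_b):
    """Mean absolute difference between two gaits given as flat lists."""
    return sum(abs(a - b) for a, b in zip(gait_a, gait_b)) / len(gait_a)

def select_best(input_gait, backward_models, simulate):
    """Return (name, conditions, error) for the model that best reproduces the gait."""
    best = None
    for name, model in backward_models.items():
        conditions = model(input_gait)
        error = gait_error(simulate(conditions), input_gait)
        if best is None or error < best[2]:
            best = (name, conditions, error)
    return best

# Toy stand-ins: the "simulator" doubles each condition, and each "expert"
# inverts that mapping with a different degree of accuracy.
def simulate(conditions):
    return [2.0 * c for c in conditions]

backward_models = {
    "knee_ankle_only": lambda gait: [0.5 * g + 0.2 for g in gait],
    "all_muscles_a": lambda gait: [0.5 * g for g in gait],
    "all_muscles_b": lambda gait: [0.4 * g for g in gait],
}
name, conditions, error = select_best([1.0, 2.0, 3.0], backward_models, simulate)
```

In this toy setup the second model inverts the simulator exactly, so it is the one selected.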
4.3.3. Data Sampling
In this work, we are dealing with a very high-dimensional space to explore, spanned by 280 conditioning variables. Each conditioning variable has its valid range specified by the user. Even examining only the corners of this high-dimensional domain would require sampling a tremendously large number of condition-gait tuples, which is computationally intractable. Surprisingly, the training of both Forward and Backward GaitNet required a much smaller collection (1.7 million) of condition-gait tuples in our experiments. Both networks generalize well and learn the influence of physical conditions on gait from very sparse samples. Given that sparse sampling is inevitable, uniformly random sampling in the high-dimensional space is not ideal for exploring large variations in physical conditions. We use an alternative approach called grid-based sampling, which selects samples only at corner points, where every condition takes either the minimum or the maximum value of its valid range. In our experiments, grid-based sampling outperforms uniform random sampling, in particular for reproducing severely impaired gaits. The rationale for our grid-based sampling is that changes in motion and muscle conditions often have a monotonically increasing or decreasing relationship. For example, interpolating two different pathological gaits, where one is generated by lengthening a specific muscle and the other by shortening it, can often produce a gait that would be generated when the muscle is in its normal condition. This implies that exposing the model to combinations of extreme input conditions is more efficient than uniform sampling within the specified range.
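The difference between the two sampling schemes can be sketched in a few lines. The valid range of (0.5, 1.0) per condition is a hypothetical placeholder; only the dimension of 280 comes from the text.

```python
import random

def sample_uniform(ranges, rng):
    """Draw every condition uniformly inside its valid range."""
    return [rng.uniform(lo, hi) for lo, hi in ranges]

def sample_corner(ranges, rng):
    """Grid-based sampling: every condition sits at its minimum or its maximum."""
    return [rng.choice((lo, hi)) for lo, hi in ranges]

NUM_CONDITIONS = 280                      # dimension reported in the text
ranges = [(0.5, 1.0)] * NUM_CONDITIONS    # hypothetical valid range per condition
rng = random.Random(0)

uniform_sample = sample_uniform(ranges, rng)
corner_sample = sample_corner(ranges, rng)

# Number of distinct corner points of the 280-dimensional box.
num_corners = 2 ** NUM_CONDITIONS
```

With 2^280 corners (more than 10^84), the 1.7 million tuples used for training touch only a vanishing fraction of them, which is why the generalization from sparse corner samples is notable.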
5. Results
To train Bidirectional GaitNet, we collected approximately 500 hours of walking motion from our predictive gait simulator. This data generation takes approximately 10 hours on a cluster machine equipped with 20 Intel Xeon 6242 CPUs. We use PyTorch (Paszke et al., 2019) to implement our models. The encoder, the pre-decoder, and Forward GaitNet are modeled by feed-forward networks with [256, 256, 256], [256, 256, 256], and [512, 512, 512] hidden layers, respectively. For the activation units of the hidden/output layers, LeakyReLU/Linear, LeakyReLU/Sigmoid, and ReLU/Linear are used, respectively.
Both models are learned by following the standard procedure of supervised learning with batch sizes of 65536 (forward) and 2048 (backward), Adam optimizer with learning rate. Training takes 2 and 1.5 hours for Forward GaitNet and Backward GaitNet, respectively, on a desktop equipped with a Ryzen 3950, an NVIDIA 2070, and 64 GB of RAM. The dimension of , , , and are 6060, 32, 13, and 268, respectively.
5.1. Evaluation on Unseen Simulated Data
Our Bidirectional GaitNet is trained by using the dataset generated by a predictive gait simulator. We first evaluate whether our model generalizes to 51 unseen simulated data points kept separate from the training data, which include normal, fully random, and pathological cases. Of the 51 data points, 7 are created by running Generative GaitNet with muscle conditions similar to those demonstrated in the previous work (Park et al., 2022) to cover well-known gait patterns (normal, foot drop, equinus, stiff knee, crouch, Trendelenburg, and waddling gait), while the remaining data points are created by randomly sampling anatomical conditions within the valid range. Note that they are generated independently, so exactly the same data are never included in the training dataset.
5.1.1. Prediction of Gaits
Ultimately, the anatomical conditions we predict should reproduce the input gait when they are run in the predictive gait simulator. We therefore measure the error between the ground truth gait and the gait simulated with the anatomical conditions predicted by Backward GaitNet, where the third row (Grid/Grid) in Table 1 shows the average and variance of the joint angle prediction error measured over 2 gait cycles. On average, the angle difference is less than 8 degrees. This means that our model is able to generalize to unseen simulated data successfully. Furthermore, the error is small enough that the predicted and ground truth gaits are almost visually indistinguishable, as demonstrated in both Figure 6 and the supplemental video.
Table 1. Joint Angle Prediction Error. Each cell shows the mean (variance) of the joint angle error in degrees. The first two columns give the sampling scheme used for Forward GaitNet and Backward GaitNet.

| Forward GaitNet | Backward GaitNet | Pelvis | FemurR | TibiaR | TalusR | FootPinkyR | FootThumbR | FemurL | TibiaL | TalusL | FootPinkyL | FootThumbL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Uniform | Uniform | 8.87398 (2.27329) | 11.9979 (1.97183) | 5.53967 (6.22365) | 14.874 (7.87755) | 8.38405 (8.08005) | 8.89714 (8.29367) | 12.2464 (7.78203) | 5.33964 (2.2433) | 14.6996 (2.27785) | 7.49264 (8.47628) | 7.35133 (12.6308) |
| Uniform | Grid | 8.77184 (4.27071) | 11.8603 (0.746043) | 5.47452 (5.75071) | 15.085 (4.78839) | 8.38307 (8.30824) | 8.45141 (8.17749) | 12.2202 (5.44729) | 5.3674 (5.05303) | 14.6827 (7.69895) | 7.29844 (12.5078) | 7.00184 (1.70778) |
| Grid | Grid | 8.47702 (2.15696) | 11.8692 (1.15372) | 5.79214 (11.8167) | 14.7716 (2.1861) | 7.62802 (6.71003) | 7.67742 (7.66728) | 11.1595 (1.70138) | 5.4935 (1.85809) | 14.5509 (1.28581) | 8.45036 (20.0347) | 8.14378 (22.3314) |

| Forward GaitNet | Backward GaitNet | Spine | Torso | Neck | Head | ShoulderR | ArmR | ForeArmR | HandR | ShoulderL | ArmL | ForeArmL | HandL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Uniform | Uniform | 8.59933 (1.50475) | 8.22234 (1.96971) | 7.4927 (4.40804) | 6.8291 (4.05152) | 1.80658 (0.316834) | 11.5427 (6.8967) | 3.24304 (0.75264) | 9.78388 (7.1614) | 1.98267 (1.95443) | 10.9824 (1.56775) | 3.00934 (2.98694) | 8.65188 (8.31125) |
| Uniform | Grid | 8.67501 (3.51157) | 8.27154 (0.26088) | 7.57086 (4.38966) | 6.83435 (4.60315) | 1.79577 (0.985541) | 11.52 (4.41658) | 3.25868 (0.643535) | 9.98208 (7.61925) | 1.99904 (1.85795) | 10.9108 (0.524045) | 3.09474 (3.09089) | 8.7093 (5.95637) |
| Grid | Grid | 8.28418 (4.57552) | 8.14461 (0.243296) | 7.24122 (0.91435) | 6.87393 (3.44239) | 1.77189 (0.147876) | 10.8449 (1.71357) | 2.98214 (1.62909) | 10.2911 (6.87899) | 1.92183 (1.64011) | 10.5057 (1.13507) | 2.74626 (2.73175) | 8.04477 (6.98929) |
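As a quick arithmetic check of the reported averages, the snippet below takes the per-joint mean errors from Table 1 and computes an unweighted mean over all 23 joints for each configuration. Treating the overall average as an unweighted mean over joints is an assumption; the paper may have aggregated differently.

```python
# Per-joint mean errors (degrees) from Table 1: 11 lower-body joints, then 12 upper-body joints.
UNIFORM_UNIFORM = [
    8.87398, 11.9979, 5.53967, 14.874, 8.38405, 8.89714,
    12.2464, 5.33964, 14.6996, 7.49264, 7.35133,
    8.59933, 8.22234, 7.4927, 6.8291, 1.80658, 11.5427,
    3.24304, 9.78388, 1.98267, 10.9824, 3.00934, 8.65188,
]
UNIFORM_GRID = [
    8.77184, 11.8603, 5.47452, 15.085, 8.38307, 8.45141,
    12.2202, 5.3674, 14.6827, 7.29844, 7.00184,
    8.67501, 8.27154, 7.57086, 6.83435, 1.79577, 11.52,
    3.25868, 9.98208, 1.99904, 10.9108, 3.09474, 8.7093,
]
GRID_GRID = [
    8.47702, 11.8692, 5.79214, 14.7716, 7.62802, 7.67742,
    11.1595, 5.4935, 14.5509, 8.45036, 8.14378,
    8.28418, 8.14461, 7.24122, 6.87393, 1.77189, 10.8449,
    2.98214, 10.2911, 1.92183, 10.5057, 2.74626, 8.04477,
]

def mean(values):
    return sum(values) / len(values)

mean_uu = mean(UNIFORM_UNIFORM)
mean_ug = mean(UNIFORM_GRID)
mean_gg = mean(GRID_GRID)
```

Under this aggregation the three configurations come out at roughly 8.17, 8.14, and 7.99 degrees, so only the Grid/Grid configuration falls below 8 degrees, and the differences between configurations are modest.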
5.1.2. Prediction of Muscle Conditions
Figure 4 shows muscle conditions predicted by our Backward GaitNet given a typical pathological gait, hip drop (Trendelenburg), which results from defective muscles on one side of the hip. Physiologically, such a dropping motion could appear either when a muscle on one side has contracture or when the other side has weakness. Our results show that both physiologically plausible conditions can be discovered successfully by our backward model. In the supplemental video, we also show a variety of predicted muscle conditions and their corresponding simulated gaits.
We also compare the ground truth muscle conditions with the estimated muscle conditions. In contrast to gait prediction error, where we can compute the joint angle difference directly, the ground truth and estimated muscle conditions are not directly comparable because multiple conditions that generate the same gait pattern could exist. Instead, given an input gait paired with the ground truth muscle conditions, we evaluate our backward model by testing whether the probability distribution predicted by our model from the input gait includes the ground truth conditions or not. We first create 1000 sets of muscle conditions from our backward model by randomly sampling from the distribution, then we run a dimensionality reduction technique on those conditions together with the ground truth conditions. Figure 3(a) shows the 2D embedding drawn by UMAP (McInnes et al., 2018), from which we could infer that the ground truth conditions are observable with high probability under the predicted distribution.
Figure 3. 2D Embeddings of predicted and ground-truth muscle conditions over 4 different input gaits. The grey and red dots represent predicted and ground-truth conditions, respectively. The ground truth conditions are observable with high probability under the distribution of predicted muscle conditions.
Figure 4. Different muscle conditions given an input gait (a.k.a. Trendelenburg).
5.2. Evaluation on Gaits of Real Patients
We show further generalization capability by evaluating our model on gaits of real patients. Unlike the evaluation on simulated data, ground truth muscle conditions do not exist for real patients because examining such conditions for every muscle of a patient is not feasible; the examination itself is a very challenging problem. We evaluate our results qualitatively by comparing the input gaits from real patients with the simulated gaits under the anatomical conditions predicted by our backward model (see Figure 6 and our supplemental video). Even though we trained our model with simulation data only, it can still produce plausible predictions for gaits from real patients, which might differ from simulation results due to the sim-to-real gap.
We also evaluate the predicted muscle conditions by using Physical Exam (Moon et al., 2017), which has been widely used in medicine for testing muscle conditions of patients. The exam measures the ranges of motion of several joints at pre-specified postures to examine whether muscle lengths are permanently shortened or lengthened, which could significantly affect locomotion function if they differ from the normal conditions. We implemented the exam as follows: the maximum range of motion of a joint is determined by measuring the magnitude of the joint torque accumulated from the passive forces of the muscles relevant to the joint while performing the exam, and the joint is regarded as having reached its maximum when the magnitude exceeds 20 Nm. Figure 5 compares the ranges of motion of the left and right ankles given a patient’s gait, where our system predicts that the asymmetry of the input gait arises because the right calf muscles are weaker than the left ones. This study was approved by the Institutional Review Board of Seoul National University Hospital (B-1107-132-101).
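The torque-threshold rule for the range of motion can be sketched as a simple angle sweep. The exponential passive torque curve, its parameters, the sweep step, and the 90-degree cap are all hypothetical; only the 20 Nm threshold comes from the text.

```python
import math

def make_passive_torque(slack_deg, stiffness=2.0, rate=0.1):
    """Hypothetical passive torque (Nm): zero until the muscle is taut, then exponential."""
    def torque(angle_deg):
        stretch = max(0.0, angle_deg - slack_deg)
        return stiffness * (math.exp(rate * stretch) - 1.0)
    return torque

def range_of_motion(passive_torque, step_deg=0.5, limit_deg=90.0, threshold_nm=20.0):
    """Largest swept angle whose passive torque magnitude stays within the threshold."""
    angle = 0.0
    while (angle + step_deg <= limit_deg
           and abs(passive_torque(angle + step_deg)) <= threshold_nm):
        angle += step_deg
    return angle

# A shortened (contracted) muscle becomes taut earlier, so its slack angle is smaller.
rom_reference = range_of_motion(make_passive_torque(slack_deg=10.0))
rom_shortened = range_of_motion(make_passive_torque(slack_deg=0.0))
```

With these toy parameters the shortened muscle reaches the 20 Nm threshold about 10 degrees earlier than the reference one, which is the kind of left/right difference the exam is meant to expose.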
Figure 5. Physical Exam on Ankles
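The range-of-motion rule used in the simulated Physical Exam can be illustrated with a minimal sketch. It assumes the joint is swept through a list of angles and that the maximum range of motion is the first angle at which the passive torque magnitude exceeds the 20 Nm threshold stated above. The function names, the angle sweep, and the exponential torque curve are hypothetical stand-ins for the musculoskeletal simulator, not the implementation used in this work.

```python
import math
from typing import Callable, Sequence

TORQUE_LIMIT_NM = 20.0  # threshold stated in the text


def max_range_of_motion(
    passive_torque: Callable[[float], float],
    angles_deg: Sequence[float],
    torque_limit: float = TORQUE_LIMIT_NM,
) -> float:
    """Sweep the joint through angles_deg and return the first angle at which
    the passive torque magnitude exceeds torque_limit. If the limit is never
    exceeded, return the last swept angle."""
    for angle in angles_deg:
        if abs(passive_torque(angle)) > torque_limit:
            return angle
    return angles_deg[-1]


def toy_passive_torque(onset_deg: float) -> Callable[[float], float]:
    """Hypothetical stand-in for the simulator: passive torque (Nm) that grows
    exponentially with the joint angle, shifted by onset_deg."""
    return lambda angle_deg: math.exp(0.3 * (angle_deg - onset_deg))


sweep = list(range(0, 41))  # assumed sweep of joint angles in degrees
rom_reference = max_range_of_motion(toy_passive_torque(10.0), sweep)
rom_shortened = max_range_of_motion(toy_passive_torque(0.0), sweep)
print(rom_reference, rom_shortened)
```

In this toy setting, a torque curve that stiffens earlier (as a shortened muscle would) reaches the threshold at a smaller angle, so the measured range of motion is reduced.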
5.3. Comparison
To show the effectiveness of the technical components adopted in our method, we conduct ablation studies on the three components that we think are crucial.
5.3.1. Grid vs. Uniform Sampling
We show the effectiveness of the grid-based random sampling that we used when generating the training data by comparing it with a model trained on data generated by uniform sampling. We run the evaluation on the same unseen simulated gaits used in Section 5.1. Table 1 compares joint angle prediction errors for three design choices: Uniform-Uniform, Uniform-Grid, and Grid-Grid (ours). The errors are generally lower when our grid-based sampling is used for learning both the forward and the backward models.
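The difference between the two sampling strategies can be sketched in a low-dimensional toy setting. The exact grid-based scheme is the one defined earlier in the paper; the sketch below assumes one common interpretation, stratified sampling, in which one random point is drawn inside every cell of a regular grid. The function names, the unit hypercube, and the grid resolution are illustrative assumptions.

```python
import itertools
import random
from typing import List, Set, Tuple


def uniform_samples(n: int, dims: int, rng: random.Random) -> List[List[float]]:
    """Draw n points independently and uniformly from the unit hypercube."""
    return [[rng.random() for _ in range(dims)] for _ in range(n)]


def grid_samples(cells_per_dim: int, dims: int, rng: random.Random) -> List[List[float]]:
    """Draw one point uniformly inside every cell of a regular grid
    (an assumed, stratified reading of grid-based random sampling)."""
    samples = []
    for cell in itertools.product(range(cells_per_dim), repeat=dims):
        samples.append([(c + rng.random()) / cells_per_dim for c in cell])
    return samples


def occupied_cells(samples: List[List[float]], cells_per_dim: int) -> Set[Tuple[int, ...]]:
    """Return the set of grid cells that contain at least one sample."""
    return {tuple(int(x * cells_per_dim) for x in s) for s in samples}


rng = random.Random(0)
grid = grid_samples(cells_per_dim=4, dims=2, rng=rng)
uniform = uniform_samples(n=len(grid), dims=2, rng=rng)
# Compare how many of the 16 cells each strategy covers with 16 samples.
print(len(occupied_cells(grid, 4)), len(occupied_cells(uniform, 4)))
```

With the same sample budget, the stratified variant covers every cell by construction, whereas independent uniform draws can leave regions of the parameter space empty. A full grid grows exponentially with the number of dimensions, so this sketch is only meant to convey the coverage argument, not to scale to hundreds of muscle parameters.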
Figure 6. A comparison of input gaits (green) and the gaits simulated (white) under the anatomical conditions predicted by our Backward GaitNet.
6. Discussion
We developed a novel generative model, called Bidirectional GaitNet, that learns the relationship between human anatomy and its gait. It consists of forward and backward models, where the forward model predicts a gait pattern of a person with specific physical conditions, while the backward model estimates the physical conditions of a person when his/her gait pattern is provided. By constructing a c-VAE structure whose decoder is the forward model learned from simulation data generated by a state-of-the-art predictive gait simulator, we were able to learn its inverse mapping (i.e., the backward model) effectively. We showed that the anatomical conditions predicted by our model could be realized for both unseen simulated gaits and real patients’ gaits.
Our method still has several limitations. First, although joint angle prediction is fairly accurate (the error is less than 8 degrees on average), it needs further improvement for some joints, such as the talus (ankle) and the femur (upper leg), which show larger errors than other joints. Note that these are the joints with much wider variations in our dataset than others. More sophisticated network architectures (e.g., Transformers) or loss functions that better capture wide variations in data would be helpful in general. Second, our method can be easily extended to include upper-body muscle conditions. However, the prediction of those muscle conditions might be less reliable than for the lower body due to the weaker dependency between gaits and upper-body muscle conditions. Simply put, upper-body motions have much more freedom and could be almost arbitrary. In addition, some anatomical features, such as the nervous system, a flexible tendon model, and skin, are missing from the musculoskeletal model we used in this research, which might affect the predictions of gait and of physical conditions in different ways.
We envision several promising directions for future work. The input to the current system must be given as mocap data, which can be cumbersome and expensive to obtain. It would be interesting to use monocular video as input to our system, which could be done with existing video-to-mocap solutions or other training schemes. Another interesting direction would be to explore more complex full-body or facial musculoskeletal models that include volumetric muscles, which would have a practical impact on the animation industry because such models are often used in commercial films.
Acknowledgements.
This study was supported by the New Faculty Startup Fund from Seoul National University, ICT (Institute of Computer Technology) at Seoul National University, and grant no. 14-2020-0012 from the SNUBH Research Fund.
References
Al Borno et al. (2013)
Mazen Al Borno, Martin de Lasa, and Aaron Hertzmann. 2013. Trajectory Optimization for Full-Body Movements with Complex Contacts. IEEE Transactions on Visualization and Computer Graphics 19, 8 (2013), 1405–1414.
Arnold et al. (2010)
Edith M Arnold, Samuel R Ward, Richard L Lieber, and Scott L Delp. 2010. A model of the lower limb for analysis of human movement. Annals of Biomedical Engineering 38, 2 (2010), 269–279.
Bergamin et al. (2019)
Kevin Bergamin, Simon Clavet, Daniel Holden, and James Richard Forbes. 2019. DReCon: data-driven responsive control of physics-based characters. ACM Transactions on Graphics 38, 6 (2019).
Carbone et al. (2015)
Vincenzo Carbone, René Fluit, Pim Pellikaan, MM Van Der Krogt, D Janssen, M Damsgaard, L Vigneron, T Feilkas, Hubertus FJM Koopman, and N Verdonschot. 2015. TLEM 2.0–A comprehensive musculoskeletal geometry dataset for subject-specific modeling of lower extremity. Journal of Biomechanics 48, 5 (2015), 734–741.
Clegg et al. (2018)
Alexander Clegg, Wenhao Yu, Jie Tan, C Karen Liu, and Greg Turk. 2018. Learning to dress: Synthesizing human dressing motion via deep reinforcement learning. ACM Transactions on Graphics 37, 6 (2018).
Cong et al. (2016)
Matthew Cong, Kiran S. Bhat, and Ronald Fedkiw. 2016. Art-Directed Muscle Simulation for High-End Facial Animation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation. 119–127.
Coros et al. (2010)
Stelian Coros, Philippe Beaudoin, and Michiel Van de Panne. 2010. Generalized biped walking control. ACM Transactions on Graphics 29, 4 (2010).
Delp et al. (2007)
Scott L Delp, Frank C Anderson, Allison S Arnold, Peter Loan, Ayman Habib, Chand T John, Eran Guendelman, and Darryl G Thelen. 2007. OpenSim: open-source software to create and analyze dynamic simulations of movement. IEEE Transactions on Biomedical Engineering 54, 11 (2007), 1940–1950.
Delp et al. (1990)
Scott L Delp, J Peter Loan, Melissa G Hoy, Felix E Zajac, Eric L Topp, and Joseph M Rosen. 1990. An interactive graphics-based model of the lower extremity to study orthopaedic surgical procedures. IEEE Transactions on Biomedical Engineering 37, 8 (1990), 757–767.
Dembia et al. (2020)
Christopher L Dembia, Nicholas A Bianco, Antoine Falisse, Jennifer L Hicks, and Scott L Delp. 2020. OpenSim Moco: musculoskeletal optimal control. PLOS Computational Biology 16, 12 (2020).
Falisse et al. (2019)
Antoine Falisse, Gil Serrancolí, Christopher L Dembia, Joris Gillis, Ilse Jonkers, and Friedl De Groote. 2019. Rapid predictive simulations with complex musculoskeletal models suggest that diverse healthy and pathological human gaits can emerge from similar control strategies. Journal of the Royal Society Interface 16, 157 (2019).
Geijtenbeek et al. (2013)
Thomas Geijtenbeek, Michiel Van De Panne, and A Frank Van Der Stappen. 2013. Flexible muscle-based locomotion for bipedal creatures. ACM Transactions on Graphics 32, 6 (2013).
Geyer and Herr (2010)
Hartmut Geyer and Hugh Herr. 2010. A muscle-reflex model that encodes principles of legged mechanics produces human walking dynamics and muscle activities. IEEE Transactions on Neural Systems and Rehabilitation Engineering 18, 3 (2010), 263–273.
Ha et al. (2012)
Sehoon Ha, Yuting Ye, and C Karen Liu. 2012. Falling and landing motion control for character animation. ACM Transactions on Graphics 31, 6 (2012).
Ichim et al. (2017)
Alexandru-Eugen Ichim, Petr Kadleček, Ladislav Kavan, and Mark Pauly. 2017. Phace: Physics-based face modeling and animation. ACM Transactions on Graphics 36, 4 (2017).
Ishiwaka et al. (2022)
Yuko Ishiwaka, Xiao S Zeng, Shun Ogawa, Donovan Michael Westwater, Tadayuki Tone, and Masaki Nakada. 2022. DeepFoids: Adaptive Bio-Inspired Fish Simulation with Deep Reinforcement Learning. In Advances in Neural Information Processing Systems.
Jiang et al. (2019)
Yifeng Jiang, Tom Van Wouwe, Friedl De Groote, and C Karen Liu. 2019. Synthesis of biologically realistic human motion using joint torque actuation. ACM Transactions on Graphics 38, 4 (2019).
Kidziński et al. (2018)
Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael F Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, et al. 2018. Learning to run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments. In The NIPS’17 Competition: Building Intelligent Systems. 121–153.
Kidziński et al. (2020)
Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, et al. 2020. Artificial intelligence for prosthetics: Challenge solutions. In The NeurIPS’18 Competition. 69–128.
Lee et al. (2022)
Seyoung Lee, Jiye Lee, and Jehee Lee. 2022. Learning Virtual Chimeras by Dynamic Motion Reassembly. ACM Transactions on Graphics 41, 6 (2022).
Lee et al. (2019)
Seunghwan Lee, Moonseok Park, Kyoungmin Lee, and Jehee Lee. 2019. Scalable muscle-actuated human simulation and control. ACM Transactions on Graphics 38, 4 (2019).
Lee et al. (2018)
Seunghwan Lee, Ri Yu, Jungnam Park, Mridul Aanjaneya, Eftychios Sifakis, and Jehee Lee. 2018. Dexterous manipulation and control with volumetric muscles. ACM Transactions on Graphics 37, 4 (2018).
Lee et al. (2009)
Sung-Hee Lee, Eftychios Sifakis, and Demetri Terzopoulos. 2009. Comprehensive biomechanical modeling and simulation of the upper body. ACM Transactions on Graphics 28, 4 (2009).
Lee et al. (2010)
Yoonsang Lee, Sungeun Kim, and Jehee Lee. 2010. Data-driven Biped Control. ACM Transactions on Graphics 29, 4 (2010).
Lee et al. (2014)
Yoonsang Lee, Moon Seok Park, Taesoo Kwon, and Jehee Lee. 2014. Locomotion control for many-muscle humanoids. ACM Transactions on Graphics 33, 6 (2014).
Levin et al. (2011)
David IW Levin, Benjamin Gilles, Burkhard Mädler, and Dinesh K Pai. 2011. Extracting skeletal muscle fiber fields from noisy diffusion tensor data. Medical Image Analysis 15, 3 (2011), 340–353.
Li et al. (2022)
Yuwei Li, Longwen Zhang, Zesong Qiu, Yingwenqi Jiang, Nianyi Li, Yuexin Ma, Yuyao Zhang, Lan Xu, and Jingyi Yu. 2022. NIMBLE: a non-rigid hand model with bones and muscles. ACM Transactions on Graphics 41, 4 (2022).
Liu and Hodgins (2018)
Libin Liu and Jessica Hodgins. 2018. Learning basketball dribbling skills using trajectory optimization and deep reinforcement learning. ACM Transactions on Graphics 37, 4 (2018).
Liu et al. (2016)
Libin Liu, Michiel Van De Panne, and KangKang Yin. 2016. Guided learning of control graphs for physics-based characters. ACM Transactions on Graphics 35, 3 (2016).
Liu et al. (2010)
Libin Liu, KangKang Yin, Michiel van de Panne, Tianjia Shao, and Weiwei Xu. 2010. Sampling-based Contact-rich Motion Control. ACM Transactions on Graphics 29, 4 (2010).
Luo et al. (2020)
Ying-Sheng Luo, Jonathan Hans Soeseno, Trista Pei-Chun Chen, and Wei-Chao Chen. 2020. CARL: Controllable agent with reinforcement learning for quadruped locomotion. ACM Transactions on Graphics 39, 4 (2020).
Masoudnia and Ebrahimpour (2014)
Saeed Masoudnia and Reza Ebrahimpour. 2014. Mixture of experts: a literature survey. Artificial Intelligence Review 42, 2 (2014), 275–293.
Matias et al. (2009)
Ricardo Matias, Carlos Andrade, and António Prieto Veloso. 2009. A transformation method to estimate muscle attachments based on three bony landmarks. Journal of Biomechanics 42, 3 (2009), 331–335.
McInnes et al. (2018)
Leland McInnes, John Healy, and James Melville. 2018. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018).
Merel et al. (2019)
Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, and Nicolas Heess. 2019. Neural Probabilistic Motor Primitives for Humanoid Control. In 7th International Conference on Learning Representations, ICLR 2019.
Moon et al. (2017)
Seung Jun Moon, Young Choi, Chin Youb Chung, Ki Hyuk Sung, Byung Chae Cho, Myung Ki Chung, Jaeyoung Kim, Mi Sun Yoo, Hyung Min Lee, and Moon Seok Park. 2017. Normative values of physical examinations commonly used for cerebral palsy. Yonsei Medical Journal 58, 6 (2017), 1170–1176.
Mordatch and Todorov (2014)
Igor Mordatch and Emo Todorov. 2014. Combining the benefits of function approximation and trajectory optimization. In Proceedings of Robotics: Science and Systems.
Nakada et al. (2018)
Masaki Nakada, Tao Zhou, Honglin Chen, Tomer Weiss, and Demetri Terzopoulos. 2018. Deep learning of biomimetic sensorimotor control for biomechanical human animation. ACM Transactions on Graphics 37, 4 (2018).
Ong et al. (2019)
Carmichael F Ong, Thomas Geijtenbeek, Jennifer L Hicks, and Scott L Delp. 2019. Predicting gait adaptations due to ankle plantarflexor muscle weakness and contracture using physics-based musculoskeletal simulations. PLOS Computational Biology 15, 10 (2019), e1006993.
Park et al. (2022)
Jungnam Park, Sehee Min, Phil Sik Chang, Jaedong Lee, Moon Seok Park, and Jehee Lee. 2022. Generative GaitNet. In ACM SIGGRAPH 2022 Conference Proceedings.
Park et al. (2019)
Soohwan Park, Hoseok Ryu, Seyoung Lee, Sunmin Lee, and Jehee Lee. 2019. Learning predict-and-simulate policies from unorganized human motion data. ACM Transactions on Graphics 38, 6 (2019).
Paszke et al. (2019)
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 8024–8035.
Peng et al. (2018)
Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel van de Panne. 2018. DeepMimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics 37, 4 (2018).
Peng et al. (2022)
Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, and Sanja Fidler. 2022. ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters. ACM Transactions on Graphics 41, 4 (2022).
Peng et al. (2021)
Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, and Angjoo Kanazawa. 2021. AMP: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics 40, 4 (2021).
Rajagopal et al. (2016)
Apoorva Rajagopal, Christopher L Dembia, Matthew S DeMers, Denny D Delp, Jennifer L Hicks, and Scott L Delp. 2016. Full-body musculoskeletal model for muscle-driven simulation of human gait. IEEE Transactions on Biomedical Engineering 63, 10 (2016), 2068–2079.
Ryu et al. (2021)
Hoseok Ryu, Minseok Kim, Seungwhan Lee, Moon Seok Park, Kyoungmin Lee, and Jehee Lee. 2021. Functionality-Driven Musculature Retargeting. Computer Graphics Forum 40, 1 (2021), 341–356.
Sachdeva et al. (2015)
Prashant Sachdeva, Shinjiro Sueda, Susanne Bradley, Mikhail Fain, and Dinesh K Pai. 2015. Biomechanical simulation and control of hands and tendinous systems. ACM Transactions on Graphics 34, 4 (2015).
Si et al. (2014)
Weiguang Si, Sung-Hee Lee, Eftychios Sifakis, and Demetri Terzopoulos. 2014. Realistic biomechanical simulation and control of human swimming. ACM Transactions on Graphics 34, 1 (2014).
Sok et al. (2007)
Kwang Won Sok, Manmyung Kim, and Jehee Lee. 2007. Simulating biped behaviors from human motion data. ACM Transactions on Graphics 26, 3 (2007).
Song and Geyer (2015)
Seungmoon Song and Hartmut Geyer. 2015. A neural circuitry that emphasizes spinal feedback generates diverse behaviours of human locomotion. The Journal of Physiology 593, 16 (2015), 3493–3511.
Wang et al. (2012)
Jack M Wang, Samuel R Hamner, Scott L Delp, and Vladlen Koltun. 2012. Optimizing locomotion controllers using biologically-based actuators and objectives. ACM Transactions on Graphics 31, 4 (2012).
Winkler et al. (2022)
Alexander Winkler, Jungdam Won, and Yuting Ye. 2022. QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars. In SIGGRAPH Asia 2022 Conference Proceedings.
Won et al. (2020)
Jungdam Won, Deepak Gopinath, and Jessica Hodgins. 2020. A scalable approach to control diverse behaviors for physically simulated characters. ACM Transactions on Graphics 39, 4 (2020), 33–1.
Won et al. (2021)
Jungdam Won, Deepak Gopinath, and Jessica Hodgins. 2021. Control strategies for physically simulated characters performing two-player competitive sports. ACM Transactions on Graphics 40, 4 (2021).
Won et al. (2022)
Jungdam Won, Deepak Gopinath, and Jessica Hodgins. 2022. Physics-based character controllers using conditional VAEs. ACM Transactions on Graphics 41, 4 (2022).
Yang et al. (2022)
Zeshi Yang, Kangkang Yin, and Libin Liu. 2022. Learning to use chopsticks in diverse gripping styles. ACM Transactions on Graphics 41, 4 (2022).
Ye and Liu (2010)
Yuting Ye and C. Karen Liu. 2010. Optimal Feedback Control for Character Animation Using an Abstract Model. ACM Transactions on Graphics 29, 4 (2010).
Ye et al. (2022)
Yongjing Ye, Libin Liu, Lei Hu, and Shihong Xia. 2022. Neural3Points: Learning to Generate Physically Realistic Full-body Motion for Virtual Reality Users. arXiv preprint arXiv:2209.05753 (2022).
Yin et al. (2007)
KangKang Yin, Kevin Loken, and Michiel Van de Panne. 2007. SIMBICON: Simple biped locomotion control. ACM Transactions on Graphics 26, 3 (2007).
Yu et al. (2019)
Ri Yu, Hwangpil Park, and Jehee Lee. 2019. Figure skating simulation from video. Computer Graphics Forum 38, 7 (2019), 225–234.
Yu et al. (2018)
Wenhao Yu, Greg Turk, and C Karen Liu. 2018. Learning symmetric and low-energy locomotion. ACM Transactions on Graphics 37, 4 (2018).
Zajac (1989)
Felix E Zajac. 1989. Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Critical Reviews in Biomedical Engineering 17, 4 (1989), 359–411.