Corresponding author: Xiang Li, Department of Automation, Central Main Building, Tsinghua University, Beijing, China 100084

Upper-Limb Rehabilitation with a Dual-Mode Individualized Exoskeleton Robot: A Generative-Model-Based Solution

Yu Chen¹, Shu Miao¹, Jing Ye², Gong Chen², Jianghua Cheng³, Ketao Du³, and Xiang Li¹
¹ Department of Automation, Tsinghua University, China
² Shenzhen MileBot Robotics Co., Ltd, China
³ Department of Rehabilitation, South China Hospital, Medical School, Shenzhen University, China
[email protected]

Abstract

Several upper-limb exoskeleton robots have been developed for stroke rehabilitation, but their rather low level of individualized assistance typically limits their effectiveness and practicability. Individualized assistance involves an upper-limb exoskeleton robot continuously assessing feedback from a stroke patient and then meticulously adjusting interaction forces to suit specific conditions and online changes. This paper describes the development of a new upper-limb exoskeleton robot with a novel online generative capability that allows it to provide individualized assistance to support the rehabilitation training of stroke patients. Specifically, the upper-limb exoskeleton robot exploits generative models to customize a well-fitted trajectory for each patient, as medical conditions, responses, and comfort feedback during training generally differ between patients. This generative capability is integrated into the two working modes of the upper-limb exoskeleton robot: an active mirroring mode for patients who retain motor abilities on one side of the body and a passive following mode for patients who lack motor ability on both sides of the body. In addition, the upper-limb exoskeleton robot has three other attractive features. First, it has six degrees of freedom (DoFs), namely five active DoFs and one passive DoF, to assist the shoulder and the elbow joints and cover the full range of upper-limb movement. Second, most of its active joints are driven by series elastic actuators (SEAs) and a cable mechanism, which absorb energy and have low inertia. These compliantly driven high DoFs provide substantial flexibility and ensure hardware safety but require an effective controller. Thus, based on the singular perturbation approach, a model-based impedance controller is proposed to fully exploit the advantages of the hardware. Third, the safety of the upper-limb exoskeleton robot is guaranteed by its hardware and software. Regarding hardware, its SEAs are tolerant to impacts and have high backdrivability. Regarding software, online trajectory refinement is performed to regulate the assistance provided by the upper-limb exoskeleton robot, and an anomaly detection network is constructed to detect and relax physical conflicts between the upper-limb exoskeleton robot and the patient. The performance of the upper-limb exoskeleton robot was illustrated in experiments involving healthy subjects and stroke patients.

keywords:

Upper-limb rehabilitation, diffusion-based trajectory generation, individualized assistance, compliantly actuated and cable-driven exoskeleton

1 Introduction

Figure 1: Dual-mode upper-limb exoskeleton rehabilitation training. In active mirroring mode, the reference trajectory of the affected side is determined by the motion intention of the unaffected side (the red dashed line). Conversely, in passive following mode, the reference trajectory is pre-programmed (the red dashed line). The generative model in active mirroring mode can predict the motion intentions of the unaffected side and utilize interactive feedback to generate an anomaly score, which is used for trajectory refinement in both modes.

Exoskeleton robots offer several advantages in stroke rehabilitation. In particular, they enable precise control of movement, consistent and repeatable therapy sessions, and collection of real-time data on patient progress (Huang and Krakauer, 2009; Jezernik et al., 2004; Said et al., 2022). Furthermore, they offer high-intensity training while reducing physical burdens on therapists, and they are adaptable to the needs and recovery stages of individual patients (Gull et al., 2020). This paper focuses on rehabilitation using upper-limb exoskeleton robots, a training task that is the focus of several prototypical or commercially available products. Among these products, the Harmony (Kim and Deshpande, 2017) stands out, as it is equipped with an anatomical shoulder mechanism designed to augment and facilitate the arm’s natural movements. Additionally, ANYexo (Zimmermann et al., 2023a) enhances the upper limb mobility of patients and exhibits sufficient flexibility to encompass a wide array of daily activities. Furthermore, the Armeo Power (Hocoma) (Lee et al., 2020) enables early-stage stroke patients to start intensive arm therapy. However, current upper-limb exoskeleton robots may not fully achieve the main goal of upper-limb rehabilitation, i.e., recovery of the manipulability of the human body with high degrees of freedom (DoFs), as they have too few DoFs to match the movement of a healthy subject and a rather low capacity for individualization. Individualization is a key feature of an upper-limb exoskeleton robot used in rehabilitation, as it enables the provision of assistance that is customized to the condition (e.g., stroke duration and medical background) of the patient. Such assistance helps the patient to regain movement ability that is original and natural, i.e., devoid of abnormal patterns. Hence, insufficient individualization results in low-quality rehabilitation training.

Generally, either the active mirroring or the passive following training mode is applied in the rehabilitation of stroke patients, according to their medical condition and as illustrated in Figure 1. Active mirroring training is used for patients who have mild impairments and an unaffected side of the body and thus retain motor abilities. In this mode, a generative model known as the intention predictor estimates the patient’s motion intentions based on historical motion data. Subsequently, these intentions are mirrored by an upper-limb exoskeleton robot on the affected side of the patient’s body. This process aims to align motion intentions between the unaffected and affected sides of the body at a neural level, thereby increasing patient engagement. Passive following training is used for patients with severely impaired motor abilities. In this mode, the patient follows a pre-defined trajectory facilitated by an upper-limb exoskeleton robot, which provides structured guidance for movement rehabilitation. The aforementioned modes can also be utilized together with manual training for patients with different medical conditions or in different stages of stroke. For example, the passive mode can be used at the early stage, followed by the active mode once the healthy side can move naturally. Such dual-mode rehabilitation is important for ensuring that an upper-limb exoskeleton robot can be used by multiple patients with various levels of physical ability.

As patients wear an upper-limb exoskeleton robot to carry out training procedures, ensuring patient safety in the presence of tight and continuous physical interactions is a primary concern. Much progress has been made in this regard, in terms of both hardware and software. In terms of hardware, safety is increased by using compliant actuators such as series elastic actuators (SEAs) (Pratt and Williamson, 1995) to absorb impact forces during interactions. Additionally, bio-inspired cable-driven actuators (Xu et al., 2023) are used to transition between different actuation modes, and safety is further guaranteed by backdrivable actuators (Zhu et al., 2021). In terms of software, safety regulation can be achieved through various control strategies, including impedance control (Li et al., 2018b), velocity field-based control (Martinez et al., 2018), and data-driven ergonomic control (Clark and Amor, 2022), all of which ensure that an upper-limb exoskeleton robot system operates safely. However, most hardware and software approaches are applied after a safety problem has occurred. A better approach is to predict potential safety problems and prevent their occurrence.

Accordingly, this paper describes the development of a new upper-limb exoskeleton robot for rehabilitation training of stroke patients that is superior to previously reported upper-limb exoskeleton robots in terms of the three characteristics described below.

1. Safety: The new upper-limb exoskeleton robot contains SEAs in most of its active joints, so that the joints absorb impact forces caused by collisions and unexpected contacts during upper-limb interaction. This design enhances the precision of torque control, thereby providing smoother assistance in rehabilitation training than previously reported upper-limb exoskeleton robots. Moreover, an anomaly detection neural network informed by interactive feedback is implemented to assess the safety and naturalness of the movement of the upper-limb exoskeleton robot in real time. This network enables the early detection of anomalies, allowing for their prevention or mitigation during rehabilitation tasks. Furthermore, an online trajectory refinement module and an impedance controller are incorporated into the upper-limb exoskeleton robot to ensure safety throughout the planning and execution phases of rehabilitation.

2. Effectiveness: The upper-limb exoskeleton robot is configured with one passive joint and five active joints, endowing it with a high number of DoFs and thus enabling the workspace to accommodate a broad spectrum of daily activities. Given the inherent uncertainty and randomness associated with human motion intentions and upper-limb movements, the system generates trajectories by sampling from a probabilistic model of motion, so that this uncertainty is considered explicitly and the generated trajectories match the nature of human intention. The effectiveness of the upper-limb exoskeleton robot was validated through clinical trials, which demonstrated that the experimental group, which underwent passive following training, recovered motor abilities more quickly than the control group.

3. Friendliness: The active joints of the upper-limb exoskeleton robot are capable of sensing force and thus can detect interaction torque. Therefore, when the upper-limb exoskeleton robot is operated in transparent mode for demonstration purposes, it can be maneuvered effortlessly by the patient. Moreover, given the substantial load sustained by the shoulder joint during upper-limb movements, a direct-drive motor is incorporated into this joint in the upper-limb exoskeleton robot. This motor delivers enhanced driving torque and increases backdrivability by eliminating nonlinear friction forces. Furthermore, to reduce the effects of inertia during movement and thus increase comfort, a cable-driven mechanism is utilized in joints that must have a broader range of motion in space than other joints.

The key novelty of this study is its design and implementation of a generative-model-based refinement framework that is capable of generating highly individualized trajectories and ensuring safety. Specifically, the framework operates in either of the two aforementioned distinct modes, namely active mirroring mode and passive following mode.

The active mirroring mode exhibits the newly developed key features described below.

- A novel diffusion model-based motion intention predictor, which uses historical motion data of the patient’s unaffected side to estimate the patient’s upper-limb motion intentions. This allows for the prediction of future trajectories and establishes the mean and variance of the patient’s motion intentions.

- A preemptive tuning algorithm that ensures that a predicted trajectory remains within a safe region by exploiting the predicted distribution to mitigate potential risks.

The passive following mode exhibits the newly developed key features described below.

- A diffusion model-based anomaly detection network capable of evaluating safety and naturalness based on an anomaly score. This score is utilized to guide the online trajectory refinement and evaluation for each rehabilitation task.

- A probabilistic movement primitives (ProMPs)-based approach to trajectory generation that captures the distribution of the patient’s motion intention. In addition, based on each sampling result’s performance, this approach iteratively optimizes the assistive trajectory.
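To make the ProMP idea above more concrete, the following is a minimal single-joint sketch, not the implementation used in this work. It assumes normalized Gaussian basis functions, ridge-regularized least squares, and time-aligned demonstrations; the function names (`rbf_features`, `fit_promp`, `sample_trajectories`) and all numerical values are hypothetical. It covers only fitting a weight distribution and sampling candidate trajectories; the iterative optimization driven by each sample's performance is not shown.

```python
import numpy as np


def rbf_features(t, n_basis=8, width=0.02):
    """Normalized Gaussian basis functions evaluated at phases t in [0, 1]."""
    centers = np.linspace(0.0, 1.0, n_basis)
    phi = np.exp(-0.5 * (t[:, None] - centers[None, :]) ** 2 / width)
    return phi / phi.sum(axis=1, keepdims=True)


def fit_promp(demos, n_basis=8, ridge=1e-6):
    """Fit a Gaussian weight distribution N(mu_w, sigma_w) to demonstrations.

    demos has shape (n_demos, n_steps): one joint, time-aligned (assumption).
    """
    n_steps = demos.shape[1]
    phi = rbf_features(np.linspace(0.0, 1.0, n_steps), n_basis)
    # Ridge-regularized least squares gives one weight vector per demonstration.
    gram = phi.T @ phi + ridge * np.eye(n_basis)
    weights = np.linalg.solve(gram, phi.T @ demos.T).T  # (n_demos, n_basis)
    mu_w = weights.mean(axis=0)
    sigma_w = np.cov(weights, rowvar=False) + ridge * np.eye(n_basis)
    return mu_w, sigma_w, phi


def sample_trajectories(mu_w, sigma_w, phi, n_samples, rng):
    """Draw weight samples and map them back to joint trajectories."""
    w = rng.multivariate_normal(mu_w, sigma_w, size=n_samples)
    return w @ phi.T  # (n_samples, n_steps)
```

In such a scheme, each sampled trajectory could be executed and scored, and the weight distribution could then be updated toward the better-scoring samples; the scoring and update rules are left open here.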

Both modes customize the assistance and generate an individualized trajectory for the patient, enabling natural and original motion patterns to be recovered by the patient. In particular, the intention predictor and anomaly detector utilize generative models to capture patterns in the latent space of large datasets. Moreover, the intention predictor and anomaly detector have distinct functions. The intention predictor accurately predicts the patient’s original motion intentions. This prediction serves as a self-mirrored trajectory that is finely tailored to the needs of the patient’s affected side and thus aligns closely with the patient’s motion intentions. Simultaneously, the anomaly detector quantitatively identifies the differences between the patient’s rehabilitation movements and those of healthy individuals. These differences are used to guide the adjustment of rehabilitation exercises, thereby enhancing the individualization of the treatment.

In a previous study (Chen et al., 2023b), we developed an individualization framework for passive following training using a variational autoencoder (VAE)-based anomaly detector. In the current study, we expand this approach to a dual-mode upper-limb exoskeleton robot by incorporating generative model technology, i.e., by utilizing a diffusion model. We present results from a series of experiments and comparisons that confirm the safety and performance of the new upper-limb exoskeleton robot. In addition, we present results from a clinical trial that validate its effectiveness and friendliness.

2 Related Works

This section reviews related work on upper-limb exoskeleton robots, trajectory generation, and interaction control. Unlike those that have been previously reported, the upper-limb exoskeleton robot reported in this paper has an efficient generative capability, which enables it to effectively provide individualized assistance.

Table 1: Comparison of Different Upper-Limb Exoskeleton Robots

Exoskeleton	Movement	No. of DoFs	Sensors	Actuation
Wu et al. (2016)	Shoulder, elbow	Four active, one passive	Encoders	Direct
Zimmermann et al. (2023b)	Shoulder, elbow, forearm	Nine active	Encoders, force sensors	Compliant
Mao and Agrawal (2012)	Shoulder, elbow, forearm	Five active	Encoders, force sensors	Cable-driven
Ours	Shoulder, elbow, forearm	Five active, one passive	Encoders, force sensors	Cable-driven and compliant

2.1 Upper-Limb Exoskeleton Robots

Upper-limb rehabilitation necessitates a large range of motion. Therefore, upper-limb exoskeleton robots must possess a sufficient number of DoFs to accommodate dynamic and complex upper-limb movements. The redundancy provided by multiple DoFs can enhance patient comfort (Kim et al., 2012). However, many upper-limb exoskeleton robots have been primarily designed for the rehabilitation of single joints, such as the shoulder (Nasr et al., 2023), elbow (Chen et al., 2019), or wrist (Martinez et al., 2013). In contrast, our newly developed upper-limb exoskeleton robot is designed to fulfill the requirements of most upper-limb rehabilitation tasks and thus features one passive joint and five active joints. It also features advanced sensory capabilities and a lightweight mechanical design. Table 1 compares our upper-limb exoskeleton robot with previously reported designs.

Unlike traditional upper-limb exoskeleton robots, which are directly motor-driven, our design incorporates a cable-driven mechanism. Consequently, it has a significantly lower weight and inertia effect than traditional upper-limb exoskeleton robots (Perry et al., 2007). The cable-driven mechanism also offers the advantages of flexibility, low-backlash gearing, and backdrivable transmission (Jau, 1988). However, the friction generated during cable transmission introduces nonlinear characteristics that complicate joint control (Wang et al., 2022). Compared with other joints, the shoulder joint requires a higher driving torque and thus exhibits higher friction. Therefore, this joint requires additional enhancements to optimize its performance.

During exoskeleton robot-supported rehabilitation, there is tight and frequent human–robot interaction and thus compliant actuation is essential. In particular, the use of compliant actuation ensures flexibility, mechanical safety, and good interaction performance (Tiboni et al., 2022; Kalita et al., 2021). As such, many upper-limb exoskeleton robots have used SEAs (Ebrahimi et al., 2017; Pan et al., 2022; Li et al., 2021). Accordingly, we incorporate SEAs within the cable-driven mechanism of our upper-limb exoskeleton robot to improve its interaction performance.

Previously developed upper-limb exoskeleton robots have demonstrated good flexibility. However, few have simultaneously incorporated cable-driven mechanisms and compliant actuator designs (to offer sufficient numbers of DoFs for expansive rehabilitation training) and force sensors (to offer enhanced sensory capabilities). Furthermore, the combination of cable drive and SEA structures creates a system with high-order nonlinear dynamics, thereby presenting substantial control challenges.

2.2 Trajectory Generation

The assistance provided by an upper-limb exoskeleton robot can be represented by the assistive (i.e., desired) trajectory that it follows to interact with the stroke patient. For the two modes of training considered in this study, the assistive trajectory is generated in different ways.

In active mirroring mode, the aim of assistance is to replicate the motion of the unaffected side of the patient’s body, and the assistive trajectory is generated as a sequence of predicted human motion intentions. Various methods have been developed to determine human motion intentions from sensor information, such as that generated by inertial measurement units (Zhu et al., 2020), force-sensing resistors (Huang et al., 2015), and surface electromyography (Lenzi et al., 2012). Additionally, Gaussian process regression was employed to estimate human motion intention from human–robot interaction data (Long et al., 2018). Furthermore, a multi-sensor fusion-based method was proposed to adaptively update an assistance profile to enhance mirror training rehabilitation (Li et al., 2022). Moreover, learning-based approaches that exploit neural networks’ capability to incorporate multimodal data and their excellent expressive capabilities have been successfully implemented in mirror training rehabilitation (Xu et al., 2020; Li et al., 2023a). However, the above-mentioned methods have primarily been designed to focus on immediate motion intentions and thus struggle to predict motion intention trends over time. Given that the affected side of the body of a stroke patient has limited motion ability, failing to predict motion intention may result in crucial factors such as safety constraints being overlooked, thereby posing risks to the patient during rehabilitation training.

In passive following mode, the aim of assistance is to enable the patient to follow a pre-defined trajectory. In this mode, the simplest method for trajectory generation involves discretizing the trajectory into multiple waypoints and then performing linear interpolation of discrete points and trajectory constraints within a planning framework (Sommerhalder et al., 2023). Trajectories may also be parameterized using polynomials and dynamic movement primitives to meet the requirements of different assistive tasks (Kagawa et al., 2015; Lanotte et al., 2021; Qiu et al., 2020). Additionally, assistance can be provided through torque profiles generated in a human-in-the-loop manner (Zhang et al., 2017) or determined via reinforcement learning (Zhang et al., 2022). However, the above-mentioned methods typically optimize feasible trajectories at only the kinematic level or by using metabolic measurements, which are labor-intensive to obtain. Thus, these methods fail to incorporate safety constraints effectively while also personalizing a trajectory, despite the former being the primary concern for an exoskeleton robot, and the latter being important for ensuring that recovered motion is natural and original.
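As an illustration of the simplest approach mentioned above (waypoints plus linear interpolation), the following sketch resamples joint-space waypoints on a uniform time grid. It is only a generic example: the function name `interpolate_waypoints`, the waypoint values, and the sampling step are hypothetical, and no trajectory constraints are enforced.

```python
import numpy as np


def interpolate_waypoints(times, waypoints, dt):
    """Linearly interpolate joint-space waypoints on a uniform time grid.

    times: (k,) strictly increasing time stamps in seconds.
    waypoints: (k, n_joints) joint positions at those time stamps.
    Returns the time grid and the interpolated trajectory.
    """
    times = np.asarray(times, dtype=float)
    waypoints = np.asarray(waypoints, dtype=float)
    grid = np.arange(times[0], times[-1] + 0.5 * dt, dt)
    grid = np.clip(grid, times[0], times[-1])  # guard against rounding past the end
    traj = np.column_stack(
        [np.interp(grid, times, waypoints[:, j]) for j in range(waypoints.shape[1])]
    )
    return grid, traj


# Hypothetical two-joint example with three waypoints.
grid, traj = interpolate_waypoints([0.0, 1.0, 2.0], [[0.0, 0.0], [1.0, 2.0], [0.0, 4.0]], dt=0.5)
```

A trajectory built this way is continuous in position only; velocity is discontinuous at the waypoints, which is one reason smoother parameterizations such as polynomials or movement primitives are often preferred.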

2.3 Interaction Control

Hardware development for exoskeleton robots driven by compliant actuators is well advanced. However, controller development for such exoskeleton robots remains challenging, because their cable-driven mechanism forms a high-order system with significant nonlinearity. To mitigate this nonlinearity, which is caused by the coupling of compliant actuators and rigid joints, several control strategies have been devised for accurate and flexible interaction control. For instance, the backstepping control method (Pan et al., 2017; Li et al., 2018b) has been devised to hierarchically deliver desired control commands, enabling position or impedance control. Additionally, singular perturbation theory (Spong, 1987) eliminates the need for high-order motion information. Thus, a method based on singular perturbation has been developed that utilizes the intrinsic differences between two subsystems on exoskeleton robots equipped with SEAs. This method has been used to generate a multimodal control scheme (Li et al., 2017) and for adaptive trajectory tracking (Han et al., 2023). Moreover, a recent advancement is a learning-based method (Sambhus et al., 2023) that further refines interaction control capabilities.

Exoskeleton robots featuring cable-driven mechanisms can have lower weights than those featuring other mechanisms. However, the substantial friction generated during cable transmission and movement may hinder effective interaction control. Multiple methodologies have been developed to alleviate friction and other disturbances in human-robot interactions. For instance, Li et al. (2018a) employed an iterative learning control algorithm that compensates for disturbances, including friction, at both motor and joint ends by accurately fitting disturbances. Li et al. (2023b) used a pulley model to obtain an analytical expression for the friction and the tension output from a Bowden cable. This model facilitated adaptive learning of friction parameters through the system’s dynamic structure to effectively counteract the above-mentioned disturbances. Additionally, Wang et al. (2022) demonstrated that friction parameters can be obtained via nonlinear fitting techniques.

However, despite significant advancements in the management of SEA systems and the mitigation of disturbances in cable transmission, it remains challenging to integrate these capabilities to enable flexible interaction control. Moreover, in cable-driven exoskeleton robots, disturbance signals and human-machine interaction forces are often overlapping. Thus, there remains a need for a systematic consideration of all of the aforementioned aspects combined with a rigorous proof of the stability of a closed-loop system, as this would enable a full exploration of the advantages of hardware (i.e., high numbers of DoFs, and cable-driven and compliant actuation).

3 Overall Structure

This study examined a prototype cable-driven upper-limb exoskeleton robot that was co-developed by Tsinghua University and Shenzhen MileBot Robotics Co., Ltd. A computer-aided design model of the upper-limb exoskeleton robot is illustrated in Figure 2. The upper-limb exoskeleton robot is equipped with one passive and five active joints, and its upper-arm and forearm lengths can be adjusted within ranges of 20 and 40 cm, respectively. The passive DoF, labeled as Joint 0, accommodates the eccentric movements of the shoulder joint and thereby eliminates the limitations of the upper-limb motion range and increases the space for patient motion training. The active joints are numbered 1 to 5 and serve specific functions, as described below.

- Joint 1: shoulder abduction and adduction
- Joint 2: shoulder flexion and extension
- Joint 3: upper-arm internal and external rotation
- Joint 4: elbow flexion and extension
- Joint 5: forearm internal and external rotation

A block diagram of the upper-limb exoskeleton robot is shown in Figure 3. Joints 1 and 2 consist of direct-drive joint modules (RJSIIT-17-RevB2 and RJSIIT-17-RevB5, respectively), each of which is outfitted with force sensors and supported by the frame. Joints 3 to 5 utilize harmonic deceleration servo motors (AK80-64) that are configured as cable-driven SEAs and equipped with torque sensors (TK17-191151) and encoders (3590S-2-104L for the internal and external rotation joints, and QY2204-SSI for the elbow joint). Additionally, these joints incorporate two potentiometers for measuring spring compression during motion. The detailed technical specifications of the upper-limb exoskeleton robot are provided in Table 2. Motor drivers are implemented to control Joints 3–5, and together with two joint modules, facilitate communication with the main board through a controller area network. The main board receives commands from a personal computer (PC) via a universal asynchronous receiver/transmitter.

The upper-limb exoskeleton robot employs a hierarchical control system. At the low level, the main board executes motion control and environmental perception, enabling precise force control and real-time acquisition of sensory data. At the high level, a Linux system on the PC operates a motion planning module and performs computationally intensive tasks, such as dynamic calculation and network inference. All programs are integrated into Robot Operating System (ROS) (Quigley et al., 2009) and run concurrently.

Table 2: Specifications of the Upper-Limb Exoskeleton

Property	Value
Continuous torque	49 N$\cdot$m (Joints 1 and 2); 48 N$\cdot$m (Joints 3, 4, and 5)
Backdrivability	Less than 0.7 N$\cdot$m at 10°/s
Weight	7.78 kg, excluding the frame
Control frequency	1000 Hz

The joints that undergo significant spatial movement, namely Joints 3, 4, and 5, utilize our self-developed SEAs equipped with cable-driven mechanisms. A working principle diagram of a SEA is provided in Figure 4. Each joint motor is outfitted with two sets of pulleys, with each set corresponding to a different rotational direction. Each end of the motor output shaft is affixed with a steel cable that is connected to a potentiometer to constitute the SEA. The potentiometer assembly includes two fixed blocks, namely fixed block 1 and fixed block 2, and a spring component. The cable from the motor end (depicted in red) is connected via a spring to fixed block 1, while another cable (depicted in blue) connects the moving joint components to fixed block 2. As the motor applies torque, fixed block 1 moves along a sleeve attached to fixed block 2, thereby compressing the spring and enabling the measurement of tension within the cable. Upon spring compression, the blocks form a unified structure that moves in synchrony with and in the same direction as the joint, thereby facilitating overall movement.
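The sensing principle described above (spring compression measured by a potentiometer, from which cable tension is inferred) can be sketched as follows. This is a simplified illustration under stated assumptions: a linear spring (Hooke's law), one spring per rotational direction, and a net joint torque equal to the tension difference times a pulley radius. The function names, the spring stiffness, and the pulley radius are hypothetical and are not taken from the hardware specification.

```python
def cable_tension(compression_m, spring_stiffness_n_per_m):
    """Cable tension (N) inferred from measured spring compression (m)."""
    if compression_m < 0.0:
        raise ValueError("spring compression must be non-negative")
    return spring_stiffness_n_per_m * compression_m


def joint_torque(compression_pos_m, compression_neg_m,
                 spring_stiffness_n_per_m, pulley_radius_m):
    """Net joint torque (N*m) from the two opposing cables (assumed model)."""
    tension_pos = cable_tension(compression_pos_m, spring_stiffness_n_per_m)
    tension_neg = cable_tension(compression_neg_m, spring_stiffness_n_per_m)
    return (tension_pos - tension_neg) * pulley_radius_m


# Hypothetical readings: 4 mm and 1 mm compression, 20 kN/m spring, 30 mm pulley.
torque = joint_torque(0.004, 0.001, 20000.0, 0.03)
```

In practice, the mapping from potentiometer voltage to compression and the spring constant would need calibration, and cable friction would bias the estimate.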

For safety, the upper-limb exoskeleton robot is equipped with an emergency stop button that physically disconnects power during urgent situations. Additional safety measures are incorporated at the software level, including measures that limit the range of motion, prevent self-collision, restrict joint velocity, and automatically cease operation in cases of excessive interaction force or torque.
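A software-level safety monitor of the kind listed above might look like the following sketch. It checks range of motion, joint velocity, and interaction torque against per-joint limits; self-collision checking is omitted because it requires a geometric model. The class and function names and all limit values are hypothetical, not those of the actual controller.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SafetyLimits:
    q_min: np.ndarray    # lower joint-position limits (rad)
    q_max: np.ndarray    # upper joint-position limits (rad)
    dq_max: np.ndarray   # joint-velocity limits (rad/s)
    tau_max: np.ndarray  # interaction-torque limits (N*m)


def check_safety(q, dq, tau_e, limits):
    """Return the names of violated checks; an empty list means no violation."""
    violations = []
    if np.any(q < limits.q_min) or np.any(q > limits.q_max):
        violations.append("range_of_motion")
    if np.any(np.abs(dq) > limits.dq_max):
        violations.append("joint_velocity")
    if np.any(np.abs(tau_e) > limits.tau_max):
        violations.append("interaction_torque")
    return violations
```

A control loop could call `check_safety` every cycle and cease operation as soon as the returned list is non-empty.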

A dynamic model of a SEA-driven upper-limb exoskeleton robot can be defined as follows (Albu-Schäffer et al., 2007):

$$\bm{M}(\bm{q})\ddot{\bm{q}}+\bm{C}(\dot{\bm{q}},\bm{q})\dot{\bm{q}}+\bm{g}(\bm{q})=\bm{S}_{1}\bm{u}+\bm{S}_{2}^{\mathsf{T}}\bm{K}(\bm{\theta}-\bm{S}_{2}\bm{q})+\bm{\tau}_{e}+\bm{S}_{2}^{\mathsf{T}}\bm{\tau}_{f}, \tag{1}$$

$$\bm{B}\ddot{\bm{\theta}}+\bm{K}(\bm{\theta}-\bm{S}_{2}\bm{q})=\bm{S}_{2}\bm{u}, \tag{2}$$

where $\bm{q}\in\Re^{5}$ is the vector of joint positions; $\bm{\theta}\in\Re^{3}$ is the vector of motor-rotor-shaft positions; and $\bm{M}(\bm{q})\in\Re^{5\times 5}$, $\bm{C}(\dot{\bm{q}},\bm{q})\dot{\bm{q}}\in\Re^{5}$, and $\bm{g}(\bm{q})\in\Re^{5}$ are the inertia matrix, centripetal and Coriolis torques, and gravitational torques of the robot, respectively. The selection matrices $\bm{S}_{1}=\text{diag}(1,1,0,0,0)\in\Re^{5\times 5}$ and $\bm{S}_{2}=[\bm{0},\bm{I}_{3}]\in\Re^{3\times 5}$ are used to separate the directly driven joints (Joints 1 and 2) from the cable-driven joints (Joints 3 to 5). In addition, $\bm{K}\in\Re^{3\times 3}$ is the stiffness matrix, $\bm{B}\in\Re^{3\times 3}$ is the inertia matrix of the motor, $\bm{\tau}_{e}\in\Re^{5}$ is the physical interaction torque vector, $\bm{\tau}_{f}\in\Re^{3}$ is the disturbance due to the cable transmission and the friction of the joint system, and $\bm{u}\in\Re^{5}$ is the control torque applied to the actuator.
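The role of the two selection matrices can be checked numerically. The sketch below builds $\bm{S}_{1}$ and $\bm{S}_{2}$ exactly as defined above and verifies three identities that follow from those definitions; the control vector `u` is an arbitrary illustrative value.

```python
import numpy as np

# Selection matrices as defined in the text.
S1 = np.diag([1.0, 1.0, 0.0, 0.0, 0.0])          # picks Joints 1 and 2
S2 = np.hstack([np.zeros((3, 2)), np.eye(3)])    # picks Joints 3 to 5

u = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # arbitrary control torque vector
u_direct = S1 @ u   # torque acting directly on the link side of (1)
u_motor = S2 @ u    # torque driving the three SEA motors in (2)

# Identities implied by the definitions.
identity_motor = np.allclose(S2 @ S2.T, np.eye(3))       # S2 S2^T = I_3
identity_full = np.allclose(S1 + S2.T @ S2, np.eye(5))   # S1 + S2^T S2 = I_5
identity_orth = np.allclose(S1 @ S2.T, 0.0)              # S1 S2^T = 0
```

These identities are what allow the fast and slow control terms to be separated between the two subsystems.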

The dynamic model of the overall system is described by (1) and (2) and exhibits different time-scales. Specifically, the SEA subsystem operates on a fast time-scale, while the upper-limb exoskeleton robot subsystem functions on a slow time-scale. To control this system effectively, the control input is formulated according to singular perturbation theory (Spong, 1987), as follows:

$$\bm{u}=\bm{u}_{f}+\bm{u}_{s}, \tag{3}$$

where $\bm{u}_{f}$ is the fast time-scale control term for stabilizing the model defined by (2), and $\bm{u}_{s}$ is the slow time-scale control term for stabilizing the model defined by (1). A representative form of $\bm{u}_{f}$ is expressed as follows:

$$\bm{u}_{f}=-\bm{S}_{2}^{\mathsf{T}}\bm{K}_{v}(\dot{\bm{\theta}}-\bm{S}_{2}\dot{\bm{q}}), \tag{4}$$

where $\bm{K}_{v}\in\Re^{3\times 3}$ is a diagonal and positive-definite matrix.

Substituting (3) and (4) into (2) yields

$$\bm{B}\ddot{\bm{\theta}}+\bm{K}(\bm{\theta}-\bm{S}_{2}\bm{q})+\bm{K}_{v}(\dot{\bm{\theta}}-\bm{S}_{2}\dot{\bm{q}})=\bm{S}_{2}\bm{u}_{s}, \tag{5}$$

which can be rewritten as follows by defining $\bm{\tau}_{o}=\bm{K}(\bm{\theta}-\bm{S}_{2}\bm{q})$ :

$$\bm{B}\ddot{\bm{\tau}}_{o}+\bm{K}_{v}\dot{\bm{\tau}}_{o}+\bm{K}\bm{\tau}_{o}=\bm{K}(\bm{S}_{2}\bm{u}_{s}-\bm{B}\bm{S}_{2}\ddot{\bm{q}}). \tag{6}$$

By introducing $\bm{K}=\bm{K}_{1}/\varepsilon^{2}$ and $\bm{K}_{v}=\bm{K}_{2}/\varepsilon$ , with $\varepsilon$ being a small positive parameter, (6) can be written as follows:

$$\varepsilon^{2}\bm{B}\ddot{\bm{\tau}}_{o}+\varepsilon\bm{K}_{2}\dot{\bm{\tau}}_{o}+\bm{K}_{1}\bm{\tau}_{o}=\bm{K}_{1}(\bm{S}_{2}\bm{u}_{s}-\bm{B}\bm{S}_{2}\ddot{\bm{q}}). \tag{7}$$
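The steps leading from (5) to (7) can be spelled out as follows. This worked derivation assumes that $\bm{K}$ is constant and that $\bm{B}$, $\bm{K}$, and $\bm{K}_{v}$ are diagonal, so that they commute; the diagonality of $\bm{B}$ and $\bm{K}$ is an assumption made here for the sketch.

```latex
\begin{align*}
\bm{\tau}_{o}=\bm{K}(\bm{\theta}-\bm{S}_{2}\bm{q})
&\;\Rightarrow\;
\dot{\bm{\theta}}-\bm{S}_{2}\dot{\bm{q}}=\bm{K}^{-1}\dot{\bm{\tau}}_{o},\qquad
\ddot{\bm{\theta}}=\bm{K}^{-1}\ddot{\bm{\tau}}_{o}+\bm{S}_{2}\ddot{\bm{q}},\\
\text{substituting into (5):}\quad
&\bm{B}\bm{K}^{-1}\ddot{\bm{\tau}}_{o}+\bm{K}_{v}\bm{K}^{-1}\dot{\bm{\tau}}_{o}+\bm{\tau}_{o}
=\bm{S}_{2}\bm{u}_{s}-\bm{B}\bm{S}_{2}\ddot{\bm{q}},\\
\text{multiplying by }\bm{K}\text{:}\quad
&\bm{B}\ddot{\bm{\tau}}_{o}+\bm{K}_{v}\dot{\bm{\tau}}_{o}+\bm{K}\bm{\tau}_{o}
=\bm{K}(\bm{S}_{2}\bm{u}_{s}-\bm{B}\bm{S}_{2}\ddot{\bm{q}}),\\
\text{with }\bm{K}=\bm{K}_{1}/\varepsilon^{2},\ \bm{K}_{v}=\bm{K}_{2}/\varepsilon\text{:}\quad
&\varepsilon^{2}\bm{B}\ddot{\bm{\tau}}_{o}+\varepsilon\bm{K}_{2}\dot{\bm{\tau}}_{o}+\bm{K}_{1}\bm{\tau}_{o}
=\bm{K}_{1}(\bm{S}_{2}\bm{u}_{s}-\bm{B}\bm{S}_{2}\ddot{\bm{q}}).
\end{align*}
```

The last line is obtained by multiplying through by $\varepsilon^{2}$, which makes explicit that a stiff spring (small $\varepsilon$) produces the two-time-scale structure.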

When $\varepsilon=0$ , the solution of (7) is $\bar{\bm{\tau}}_{o}=\bm{S}_{2}\bm{u}_{s}-\bm{B}\bm{S}_{2}\ddot{\bm{q}}$ .

If the fast time-scale is defined as $\gamma=\frac{t}{\varepsilon}$, then $\bm{\tau}_{o}$ converges to $\bar{\bm{\tau}}_{o}$ as $\gamma\rightarrow\infty$, and $\bar{\bm{\tau}}_{o}$ remains constant on this time-scale at $\varepsilon=0$. Next, we introduce a new variable $\bm{\eta}=\bm{\tau}_{o}-\bar{\bm{\tau}}_{o}$ to rewrite (7) on the fast time-scale, as follows:

$$\bm{B}\frac{\mathrm{d}^{2}\bm{\eta}}{\mathrm{d}\gamma^{2}}+\bm{K}_{2}\frac{\mathrm{d}\bm{\eta}}{\mathrm{d}\gamma}+\bm{K}_{1}\bm{\eta}=\bm{0}, \tag{8}$$

which defines the boundary-layer system.
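The boundary-layer system (8) is a linear second-order system, so its stability can be inspected through the eigenvalues of its state-space form with state $[\bm{\eta};\,\mathrm{d}\bm{\eta}/\mathrm{d}\gamma]$. The sketch below does this numerically; the values of $\bm{B}$, $\bm{K}_{1}$, and $\bm{K}_{2}$ are hypothetical and chosen only to be diagonal and positive-definite. It is a numerical illustration, not a substitute for the stability analysis in the Appendix.

```python
import numpy as np


def boundary_layer_matrix(B, K1, K2):
    """State matrix of B*eta'' + K2*eta' + K1*eta = 0 with state [eta, eta']."""
    n = B.shape[0]
    B_inv = np.linalg.inv(B)
    top = np.hstack([np.zeros((n, n)), np.eye(n)])
    bottom = np.hstack([-B_inv @ K1, -B_inv @ K2])
    return np.vstack([top, bottom])


# Hypothetical diagonal, positive-definite parameters for the three SEA joints.
B = np.diag([0.02, 0.02, 0.01])
K1 = np.diag([4.0, 4.0, 2.0])
K2 = np.diag([1.0, 1.0, 0.5])

A = boundary_layer_matrix(B, K1, K2)
eigenvalues = np.linalg.eigvals(A)
is_stable = bool(np.all(eigenvalues.real < 0.0))  # all poles in the open left half-plane
```

For diagonal positive-definite matrices, each joint reduces to a scalar equation $b\,s^{2}+k_{2}s+k_{1}=0$ with positive coefficients, whose roots always have negative real parts.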

Substituting the steady-state solution of (8), i.e., $\bm{\eta}=\bm{0}$ and hence $\bm{\tau}_{o}=\bar{\bm{\tau}}_{o}$, into (1) affords a quasi-steady-state system that captures the slow dynamics of the overall system. These dynamics can be expressed as follows:

\displaystyle(\bm{M}(\bm{q})+\bar{\bm{B}})\ddot{\bm{q}}+\bm{C}(\dot{\bm{q}},\bm{q})\dot{\bm{q}}+\bm{g}(\bm{q})=\bm{u}_{s}+\bm{\tau}_{e}+\bm{S}_{2}^{\mathsf{T}}\bm{\tau}_{f},

(9)

where $\bar{\bm{B}}=\bm{S}_{2}^{\mathsf{T}}\bm{B}\bm{S}_{2}$ is the projected motor inertia matrix.

According to singular perturbation theory, the stability of the overall system is guaranteed if the boundary-layer system and the quasi-steady-state system are both exponentially stable. A stability analysis of this system is provided in the Appendix.

The overall dynamic model described by (1) and (2) is a high-order system, where (1) represents the rigid-joint side, and (2) represents the compliant actuator side. It is non-trivial to stabilize and control such a system.

An upper-limb exoskeleton robot needs to guide the patient to perform repetitive motions via close interaction to help the patient to regain motor function. Thus, the upper-limb exoskeleton robot is controlled to track a time-varying trajectory in accordance with the following impedance model:

\displaystyle\bm{M}_{d}(\ddot{\bm{q}}-\ddot{\bm{q}}_{d})+\bm{C}_{d}(\dot{\bm{q}}-\dot{\bm{q}}_{d})+\bm{K}_{d}(\bm{q}-\bm{q}_{d})=\bm{\tau}_{e},

(10)

where $\bm{q}_{d}\in\Re^{n}$ is the vector of the desired trajectory, and $\bm{M}_{d},\bm{C}_{d},\bm{K}_{d}\in\Re^{n\times n}$ are the desired inertia, the desired damping, and the desired stiffness matrices, respectively, which are diagonal and positive-definite. Tracking the desired trajectory in accordance with the impedance model allows the patient to deviate from the trajectory and hence provides a certain level of compliance to improve safety.

The proposed dual-mode training framework is illustrated in Figure 5. Both the anomaly detector and intention predictor are designed using trained generative models based on a dataset of motions of healthy individuals. These models are designed to capture the motion intentions of the unaffected side of the patient and evaluate the comfort and naturalness of the current patient–robot interaction. Individualization is facilitated by employing two methods, i.e., employing the anomaly score to guide the refinement of a desired trajectory that complies with dynamic constraints, and employing ProMPs to integrate various assistive trajectories into a personalized assistance distribution. Additionally, the safety of the motion process is enhanced by adjusting the impedance parameters according to the anomaly score, thereby ensuring a secure and effective rehabilitation environment.

4 Online Trajectory Refinement

This section introduces the new trajectory generation method for the upper-limb exoskeleton robot. This method is based on the generative models and can be refined online to suit the patient while maintaining safety and individualized features, and hence maintaining effectiveness. First, in both training modes (i.e., active mirroring and passive following modes), we introduce a general integrated vector $\bm{x}_{d}=[\bm{q}_{d}^{\mathsf{T}},\dot{\bm{q}}_{d}^{\mathsf{T}}]^{\mathsf{T}}$ , which encapsulates a sequence of discretized trajectory points within the joint space, thereby parameterizing the trajectory. The following quadratic programming problem is formulated to describe the planning process:

$\displaystyle\min_{\bm{x}_{d},\bm{u}_{d},s}$	$\displaystyle\sum_{i=t}^{t+N_{p}}\left[\|\bm{q}_{d}^{(i)}-\bm{q}_{r}^{(i)}\|_{\bm{Q}}^{2}+\|\bm{u}_{d}^{(i)}\|_{\bm{R}}^{2}+(s^{(i)})^{2}\right],$	(11)
$\displaystyle\text{s.t.}\quad$	$\displaystyle{\bm{x}}_{d}^{(t+1)}=\begin{bmatrix}\bm{I}&\bm{I}\Delta t\\ \bm{0}&\bm{I}\end{bmatrix}\bm{x}_{d}^{(t)}+\begin{bmatrix}\bm{0}\\ \bm{I}\Delta t\end{bmatrix}\bm{u}_{d}^{(t)},$
	$\displaystyle s^{(t+1)}=s^{(t)}+\begin{bmatrix}\bm{0}\\ -(\frac{\partial f}{\partial\bm{\tau}_{e}})^{\mathsf{T}}\bm{K}_{a}\Delta t\end{bmatrix}\bm{x}_{d}^{(t)}-(\frac{\partial f}{\partial\bm{\tau}_{e}})^{\mathsf{T}}\bm{C}_{a}\bm{u}_{d}^{(t)}\Delta t,$
	$\displaystyle\bm{x}_{d}\in\mathcal{X},\ \bm{u}_{d}\in\mathcal{U},$

where $N_{p}$ is the predictive horizon, $\bm{Q}$ and $\bm{R}$ are symmetric positive-definite weighting matrices, $\bm{u}_{d}$ is the acceleration of the desired trajectory, $\bm{C}_{a}$ and $\bm{K}_{a}$ are the varying impedance parameters to be defined later, and $f(\cdot)$ is a functional representation of the anomaly detection network. In addition, the superscript indicates the time-step, and $\mathcal{X}$ and $\mathcal{U}$ are the sets of trajectory space constraints and acceleration constraints, respectively. Thus, (11) enables various constraints and safety measures to be included in both training modes.
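To make the dynamic constraint in (11) concrete, the sketch below propagates the stacked state with the discrete double-integrator model and evaluates the tracking and effort terms of the objective. The function names are hypothetical, the slack term and the set constraints are omitted, and no optimizer is involved; this only illustrates how a candidate acceleration sequence would be scored.

```python
import numpy as np

def rollout(x0, accelerations, dt):
    """Propagate x = [q; q_dot] with the discrete double-integrator constraint of (11)."""
    n = x0.size // 2
    I, Z = np.eye(n), np.zeros((n, n))
    A = np.block([[I, I * dt], [Z, I]])
    Bm = np.vstack([Z, I * dt])
    states = [np.asarray(x0, dtype=float)]
    for u in accelerations:
        states.append(A @ states[-1] + Bm @ u)
    return np.array(states)

def tracking_cost(states, q_ref, accelerations, Q, R):
    """Tracking and effort terms of (11); the slack term is omitted in this sketch."""
    n = Q.shape[0]
    cost = 0.0
    for x, q_r, u in zip(states[:-1], q_ref, accelerations):
        e = x[:n] - q_r
        cost += e @ Q @ e + u @ R @ u
    return float(cost)

# One joint, two steps, constant unit acceleration (illustrative values only).
dt = 0.1
u_seq = [np.array([1.0]), np.array([1.0])]
states = rollout(np.array([0.0, 0.0]), u_seq, dt)
cost = tracking_cost(states, [np.zeros(1)] * 2, u_seq, np.eye(1), np.eye(1))
```

In practice a quadratic programming solver would search over the acceleration sequence subject to the sets $\mathcal{X}$ and $\mathcal{U}$; the rollout above is only the model that such a solver would embed.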

4.1 Active Training Mode

In cases of mild motor impairment, such as that observed in early-stage stroke, patients retain functional motion capabilities on their unaffected side. Thus, the upper-limb exoskeleton robot must accurately estimate the motion intentions of the patient’s unaffected side to provide mirrored support for the patient’s affected side. This mirroring process enables the affected limb to emulate the movement of the unaffected limb, thereby enhancing the recovery of motor function.

The intention estimation system inputs a series of $N$ historical observations, defined as $\bm{o}_{(i)}=\{\bm{q}_{h(i)}^{(t)}\in\Re^{5}|t=-N_{o},-N_{o}+1,\cdots,0\},\forall i\in\{1,2,\cdots,N\}$ , where $N_{o}$ is the number of past time steps, and $\bm{q}_{h(i)}^{(t)}$ is the position of the unaffected side of the body at time step $t$ . The system also generates a prediction of the motion intention, expressed as $\bm{p}_{(i)}=\{{\bm{q}}_{h(i)}^{(t)}\in\Re^{5}|t=1,2,\cdots,N_{p}\}$ . To estimate the potential distribution of plausible trajectories, we introduce a probabilistic predictor based on denoising diffusion probabilistic models (DDPMs) (Ho et al., 2020), as DDPMs can effectively describe the uncertain nature of human motion intention. For clarity, we omit the subscript $(i)$ in the following subsections and refer to the past and future trajectories as $\bm{o}$ and $\bm{p}$ , respectively.

Intention Predictor: The role of the intention predictor is to produce probabilistic forecasts of future trajectories. As such, the intention predictor is informed by an initial, ambiguous predicted trajectory $\bm{p}_{T}$ , which represents a noise-perturbed estimate of the future trajectory $\bm{p}_{0}$ after $T$ diffusion steps, where $T$ is the pre-defined maximum number of diffusion steps. The predictor is designed to infer the reverse diffusion process, articulated as the sequence $(\bm{p}_{T},\bm{p}_{T-1},\cdots,\bm{p}_{0})$ . This sequence is mathematically structured as a Markov chain that is characterized by Gaussian transition probabilities and refines the prediction by reducing the initial uncertainty. Furthermore, the intention predictor incorporates historical observations encoded into a context vector $\bm{c}$ , which is synthesized via a neural network parametrized by $\phi$ . This context vector informs the trajectory generation process. The principle of the intention predictor is depicted in Figure 6. Within this framework, the upper-limb exoskeleton robot is endowed with the ability to precisely estimate the future trajectory based on a simple sampling from the initial noisy prediction. Importantly, a future trajectory represents the patient’s motion intention and is not based on direct measurements of the patient’s unaffected limb. Thus, this approach generates a self-mirrored trajectory that closely aligns with the patient’s original movement patterns, thereby representing a personalized trajectory.

The posteriors associated with diffusion and reverse processes are given as follows:

	$\displaystyle q(\bm{p}_{t}|\bm{p}_{t-1})$	$\displaystyle=\mathcal{N}(\bm{p}_{t};\sqrt{1-\beta_{t}}\bm{p}_{t-1},\beta_{t}\bm{I}),$		(12)
	$\displaystyle q_{\psi}(\bm{p}_{t-1}|\bm{p}_{t},\bm{c})$	$\displaystyle=\mathcal{N}(\bm{p}_{t-1};\bm{\mu}_{\psi}(\bm{p}_{t},\bm{c}),\tilde{\beta}_{t}\bm{I}),$		(13)

where $\beta_{1},\beta_{2},\cdots,\beta_{T}$ is the variance schedule used to adjust the injected noise. The adjusted variance, $\tilde{\beta}_{t}$ , is defined as $\tilde{\beta}_{t}=\frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_{t}}\beta_{t}$ , where this formulation is derived by adopting the definitions $\alpha_{t}=1-\beta_{t}$ and $\bar{\alpha}_{t}=\prod_{i=1}^{t}\alpha_{i}$ .
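The schedule quantities defined above can be computed in a few lines. The sketch below is a minimal illustration, assuming a linear variance schedule (the schedule actually used is not stated here) and the convention $\bar{\alpha}_{0}=1$; `diffuse` uses the standard closed-form DDPM identity for sampling $\bm{p}_{t}$ directly from $\bm{p}_{0}$. All function names are hypothetical.

```python
import numpy as np

def diffusion_coefficients(betas):
    """Return alpha_t, alpha_bar_t, and the adjusted variance beta_tilde_t."""
    betas = np.asarray(betas, dtype=float)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    # Convention (assumption): alpha_bar_0 = 1, so beta_tilde_1 = 0.
    alpha_bars_prev = np.concatenate([[1.0], alpha_bars[:-1]])
    beta_tildes = (1.0 - alpha_bars_prev) / (1.0 - alpha_bars) * betas
    return alphas, alpha_bars, beta_tildes

def diffuse(p0, t, alpha_bars, rng):
    """Sample p_t given p_0 in closed form; t is 1-indexed."""
    eps = rng.standard_normal(p0.shape)
    a_bar = alpha_bars[t - 1]
    return np.sqrt(a_bar) * p0 + np.sqrt(1.0 - a_bar) * eps, eps

# Assumed linear schedule with T = 100 steps (illustrative values only).
betas = np.linspace(1e-4, 0.02, 100)
alphas, alpha_bars, beta_tildes = diffusion_coefficients(betas)
rng = np.random.default_rng(0)
p0 = np.zeros((10, 5))  # 10 future steps, 5 joints
p_t, eps = diffuse(p0, t=50, alpha_bars=alpha_bars, rng=rng)
```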

The objective of the predictor is to maximize the log-likelihood, denoted as $\mathbb{E}[\log q_{\psi}(\bm{p}_{0})]$ . Given that direct computation of this metric is infeasible, a variational lower bound is utilized as a surrogate for the training loss:

	$\displaystyle\mathcal{L}_{vlb}(\phi,\psi)=$	$\displaystyle\sum_{t=2}^{T}D_{KL}\left(q(\bm{p}_{t-1}|\bm{p}_{t})\,\|\,q_{\psi}(\bm{p}_{t-1}|\bm{p}_{t},\bm{c})\right)$
		$\displaystyle-\log q_{\psi}(\bm{p}_{0}|\bm{p}_{1},\bm{c}),$		(14)

where $D_{KL}(\cdot)$ signifies the Kullback–Leibler divergence function.

In the proposed methodology, the conditional probability $q(\bm{p}_{t-1}|\bm{p}_{t})$ is reformulated as $q(\bm{p}_{t-1}|\bm{p}_{t},\bm{p}_{0})$ . This reformulation allows the following closed-form expression to be devised:

\displaystyle q(\bm{p}_{t-1}|\bm{p}_{t},\bm{p}_{0})=\mathcal{N}(\bm{p}_{t-1};\tilde{\bm{\mu}}_{t}(\bm{p}_{t},\bm{p}_{0}),\tilde{\beta}_{t}\bm{I}).

(15)

By iteratively applying the reparameterization technique to the diffusion process (12), the mean of the posterior distribution in (15) and that of the reverse process (13) can be expressed as follows:

	$\displaystyle\tilde{\bm{\mu}}_{t}(\bm{p}_{t},\bm{p}_{0})=$	$\displaystyle\frac{1}{\sqrt{\alpha_{t}}}(\bm{p}_{t}-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\bm{\epsilon}_{t}),$		(16)
	$\displaystyle\bm{\mu}_{\psi}(\bm{p}_{t},\bm{c})=$	$\displaystyle\frac{1}{\sqrt{\alpha_{t}}}(\bm{p}_{t}-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\bm{\epsilon}_{\psi}(\bm{p}_{t},\bm{c})),$		(17)

where $\bm{\epsilon}\thicksim\mathcal{N}(\bm{0},\bm{I})$, and the loss function applied in active mirroring mode is further simplified as follows:

\displaystyle\mathcal{L}_{a}(\phi,\psi)=\mathbb{E}_{t,\bm{p}_{0},\bm{\epsilon}}[\|\bm{\epsilon}-\bm{\epsilon}_{\phi,\psi}(\bm{p}_{t},\bm{o})\|^{2}].

(18)

Once the reverse process is trained, multiple trajectories are sampled by utilizing the predictor to infer from the initial noise. For simplicity and conservatism, it is assumed that the future of each joint is predicted in an independent manner. Consequently, the entire inference and statistical process is encapsulated in the following equation:

\displaystyle\hat{\bm{p}}_{0},\hat{\bm{\Sigma}}_{p}=f_{a}(\bm{o}),

(19)

where $\hat{\bm{p}}_{0}=\{\hat{\bm{q}}_{h}^{(t)}\in\Re^{5}|t=1,2,\cdots,N_{p}\}$ represents the estimated motion intention, and $\hat{\bm{\Sigma}}_{p}=\{diag(\hat{\bm{\sigma}}_{h}^{(t)})\in\Re^{5\times 5}|t=1,2,\cdots,N_{p}\}$ represents the variance of the prediction.
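The statistical step behind (19) can be sketched as follows, assuming that the sampled trajectories are stacked in an array of shape `(S, N_p, 5)` and that each joint is summarized independently, as stated above. The per-joint spread is computed here as the sample variance (its square root is the standard deviation). The function name and array layout are illustrative assumptions.

```python
import numpy as np

def summarize_samples(samples):
    """Per-step, per-joint mean and sample variance of sampled trajectories.

    samples: array of shape (S, N_p, n_joints), one trajectory per sample.
    """
    p_hat = samples.mean(axis=0)
    var_hat = samples.var(axis=0, ddof=1)  # unbiased estimate across samples
    return p_hat, var_hat

# Two hand-made "samples" over 3 future steps and 5 joints (illustrative only).
samples = np.stack([np.zeros((3, 5)), 2.0 * np.ones((3, 5))])
p_hat, var_hat = summarize_samples(samples)
```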

Algorithm 1 Preemptive Tuning for the Reference Trajectory

Given: $\hat{\bm{p}}_{0},\hat{\bm{\Sigma}}_{p},\bm{q},\varepsilon$

$\bm{q}_{r}^{(0)}={\rm Saturate}(\hat{\bm{q}}_{h}^{(0)},\bm{q}_{\min},\bm{q}_{\max})$

for $t=1$ to $N_{p}$ do

for $i=1$ to $5$ do

$\delta_{i}^{(t)}=\min(\|\hat{q}_{hi}^{(t-1)}-q_{\min,i}\|,\|q_{\max,i}-\hat{q}_{hi}^{(t-1)}\|)$

if $\frac{\hat{\sigma}_{hi}^{(t)}}{\hat{\sigma}_{hi}^{(t)}+(\delta_{i}^{(t)})^{2}}\leq\varepsilon$ then

$q_{ri}^{(t)}=\hat{q}_{hi}^{(t)}$

else

$q_{ri}^{(t)}=q_{ri}^{(t-1)}$

end if

end for

end for

Output: $\bm{q}_{r}^{(0)},\cdots,\bm{q}_{r}^{(N_{p})}$

Preemptive Tuning: Given the distribution of a predicted future trajectory, the reference trajectory must be tuned preemptively to ensure compliance with the established constraints. Specifically, for each joint $i$, denoted by a subscript, a feasible trajectory must satisfy the following probabilistic inequality:

\displaystyle\mathcal{P}({q}_{hi}^{(t)}\notin\mathcal{Q}_{i})<\varepsilon,

(20)

where $\mathcal{Q}_{i}$ is the pre-defined permissible motion range $[{q}_{\min,i},{q}_{\max,i}]$ . It is stipulated that $\hat{q}_{hi}^{(0)}=q_{i}$ , where $q_{i}$ is the current position of joint $i$ of the upper-limb exoskeleton robot. In accordance with Cantelli’s inequality, the predicted distribution is characterized by:

	$\displaystyle\mathcal{P}({q}_{hi}^{(t)}\geq\hat{q}_{hi}^{(t)}+\delta_{i}^{(t)})\leq\frac{\hat{\sigma}_{hi}^{(t)}}{\hat{\sigma}_{hi}^{(t)}+(\delta_{i}^{(t)})^{2}},$		(21)
	$\displaystyle\delta_{i}^{(t)}=\min(\|\hat{q}_{hi}^{(t-1)}-{q}_{\min,i}\|,\|{q}_{\max,i}-\hat{q}_{hi}^{(t-1)}\|).$		(22)

Now, we introduce the following tuning law for prediction:

\displaystyle q_{ri}^{(t)}=\begin{cases}\hat{q}_{hi}^{(t)},&\text{if }\frac{\hat{\sigma}_{hi}^{(t)}}{\hat{\sigma}_{hi}^{(t)}+(\delta_{i}^{(t)})^{2}}\leq\varepsilon,\\ q_{ri}^{(t-1)},&\text{otherwise}.\end{cases}

(23)

Here, $q_{ri}^{(t)}$ is the reference trajectory of joint $i$ at time-step $t$. It is updated to the predicted motion intention $\hat{q}_{hi}^{(t)}$ when the movement trend is within the established boundaries. However, if a potential boundary violation is detected, it retains its value from the previous time-step to prevent the limit from being crossed. This process is also detailed in Algorithm 1. The adjusted reference trajectory $\bm{q}_{r}$ thus estimates the intended motions of the unaffected side of the body while respecting the joint limits. By solving (11), it is possible to generate a desired trajectory for the upper-limb exoskeleton robot that is smooth, requires minimal acceleration, and remains safe over the prediction horizon.
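The tuning law (23) and Algorithm 1 can be sketched in vectorized form as below. This is an illustrative implementation under two assumptions: the spread $\hat{\sigma}_{hi}^{(t)}$ is treated as a variance, as Cantelli's inequality requires, and a zero denominator is mapped to a bound of one so that the reference is held. The function and argument names are hypothetical.

```python
import numpy as np

def preemptive_tuning(q_hat, var_hat, q_now, q_min, q_max, eps):
    """Hold the reference whenever the Cantelli bound on a limit violation exceeds eps.

    q_hat:   (N_p, n) predicted positions for steps 1..N_p
    var_hat: (N_p, n) predicted variances for the same steps (assumed variances)
    q_now:   (n,) current joint positions, used as the step-0 prediction
    Returns q_ref with shape (N_p + 1, n), including step 0.
    """
    q_hat = np.vstack([q_now, q_hat]).astype(float)  # row t now matches time step t
    var_hat = np.asarray(var_hat, dtype=float)
    q_ref = np.empty_like(q_hat)
    q_ref[0] = np.clip(q_hat[0], q_min, q_max)       # Saturate(...)
    for t in range(1, q_hat.shape[0]):
        # Margin to the nearest limit, measured from the previous prediction, as in (22).
        delta = np.minimum(np.abs(q_hat[t - 1] - q_min), np.abs(q_max - q_hat[t - 1]))
        var = var_hat[t - 1]
        denom = var + delta**2
        # Bound of (21); a zero denominator is treated conservatively as bound = 1.
        bound = np.divide(var, denom, out=np.ones_like(denom), where=denom > 0)
        q_ref[t] = np.where(bound <= eps, q_hat[t], q_ref[t - 1])
    return q_ref

# One joint with limits [-1, 1]; the last prediction would leave the range.
q_ref = preemptive_tuning(
    q_hat=np.array([[0.5], [0.95], [1.2]]),
    var_hat=np.full((3, 1), 0.01),
    q_now=np.array([0.0]),
    q_min=np.array([-1.0]),
    q_max=np.array([1.0]),
    eps=0.1,
)
```

In this example the first two predictions are accepted, whereas the third is rejected because the preceding prediction (0.95) leaves a margin of only 0.05 to the upper limit, so the reference is held at 0.95.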

4.2 Passive Training Mode

In cases of significant motor impairment, the upper-limb exoskeleton robot is engaged in passive following training. This mode of training is essential for assisting the patient to follow a pre-defined trajectory. Moreover, to promote rehabilitation, it is crucial to individualize the assistance in accordance with the assist-as-needed principle. The workflow of passive following training is depicted in Figure 5. In this mode, motion data from the upper limbs of healthy subjects are collected to facilitate the training of two distinct modules.

Anomaly Detector: To individualize assistance, a real-time criterion is needed for evaluating the assistive trajectory. We use an anomaly detector to quantify the comfort of wear and the effectiveness of rehabilitation. This anomaly detector is based on a diffusion model architecture and identifies irregular patterns in upper-limb movements, as illustrated in Figure 7. Subsequently, the anomaly detector computes a score that quantifies the deviation between the current human–robot interaction and a natural interaction condition, thereby guiding the individualization of assistance. In this approach, sensory feedback at a given timestep $i$, captured through a sliding window mechanism and denoted as $\bm{\mathbf{x}}^{(i)}\in\Re^{L_{s}N_{c}}$, is fed into the anomaly detector, where $L_{s}$ is the width of the sliding window, and $N_{c}$ is the number of data channels.

As mentioned, the anomaly detector is based on a diffusion model, wherein the reverse diffusion process of duration $T^{p}$ is delineated as the sequence $(\bm{\mathbf{x}}_{T^{p}},\bm{\mathbf{x}}_{T^{p}-1},\cdots,\bm{\mathbf{x}}_{0})$ . The associated diffusion and reverse diffusion processes are defined as follows:

	$\displaystyle q(\bm{\mathbf{x}}_{t}|\bm{\mathbf{x}}_{t-1})$	$\displaystyle=\mathcal{N}(\bm{\mathbf{x}}_{t};\sqrt{1-\beta_{t}^{p}}\bm{\mathbf{x}}_{t-1},\beta_{t}^{p}\bm{I}),$		(24)
	$\displaystyle q_{\Psi}(\bm{\mathbf{x}}_{t-1}|\bm{\mathbf{x}}_{t})$	$\displaystyle=\mathcal{N}(\bm{\mathbf{x}}_{t-1};\bm{\mu}_{\Psi}(\bm{\mathbf{x}}_{t}),\tilde{\beta}_{t}^{p}\bm{I}),$		(25)

where $\beta_{1}^{p},\beta_{2}^{p},\cdots,\beta_{T^{p}}^{p}$ is the variance schedule used to modulate the level of noise injected during the process. The adjusted variance, $\tilde{\beta}_{t}^{p}$, is calculated as $\tilde{\beta}_{t}^{p}=\frac{1-\bar{\alpha}^{p}_{t-1}}{1-\bar{\alpha}_{t}^{p}}\beta_{t}^{p}$, with $\alpha_{t}^{p}=1-\beta_{t}^{p}$ and $\bar{\alpha}_{t}^{p}=\prod_{i=1}^{t}\alpha_{i}^{p}$.

In line with the approach used for the intention predictor, the loss function used to train the diffusion model in passive following mode is specified as follows:

\displaystyle\mathcal{L}_{p}(\Psi)=\mathbb{E}_{t,\bm{\mathbf{x}}_{0},\bm{\epsilon}}[\|\bm{\epsilon}-\bm{\epsilon}_{\Psi}(\bm{\mathbf{x}}_{t})\|^{2}].

(26)

Once the reverse process has been learned from the dataset of upper-limb movements, the anomaly detector can filter noise from contaminated sensory data to yield a denoised output, i.e., a refined sensory input. This input is used to compute the anomaly score, as delineated in Algorithm 2, where $\bm{\epsilon}_{p},\bm{z}_{p}\thicksim\mathcal{N}(\bm{0},\bm{I})$ and $\nu\in[1,T^{p}]$ is a constant parameter. For brevity, the calculation of the anomaly score is encapsulated by the following function:

\displaystyle s=f(\bm{q},\dot{\bm{q}},\bm{\theta},\dot{\bm{\theta}},\bm{\tau}_{e}).

(27)

Algorithm 2 Anomaly Detection

Input: $\bm{q},\dot{\bm{q}},\bm{\theta},\dot{\bm{\theta}},\bm{\tau}_{e}$

1: Initialize sliding window queue

2: for each new data point do

3: $\bm{\mathbf{x}}_{now}^{(i)}\leftarrow(\bm{q},\dot{\bm{q}},\bm{\theta},\dot{\bm{\theta}},\bm{\tau}_{e})$

4: Enqueue $\mathbf{x}_{now}^{(i)}$ to sliding window queue

5: if sliding window queue is full then

6: $\mathbf{x}_{0}\leftarrow\text{GetWindowData}()$

7: $\bm{\epsilon}_{p}\leftarrow\text{Sample}()$

8: $\hat{\mathbf{x}}_{\nu}\leftarrow\mathbf{x}_{0}\sqrt{\bar{\alpha}^{p}_{\nu}}+\bm{\epsilon}_{p}\sqrt{1-\bar{\alpha}^{p}_{\nu}}$

9: for $t=\nu,\cdots,1$ do

10: $\bm{z}_{p}\leftarrow\text{Sample}()$

11: $\hat{\mathbf{x}}_{t-1}\leftarrow\frac{1}{\sqrt{\alpha^{p}_{t}}}(\hat{\mathbf{x}}_{t}-\frac{1-\alpha^{p}_{t}}{\sqrt{1-\bar{\alpha}^{p}_{t}}}\bm{\epsilon}_{\Psi}(\hat{\mathbf{x}}_{t}))+\tilde{\beta}_{t}^{p}\bm{z}_{p}$

12: end for

13: $s\leftarrow{||\mathbf{x}_{0}-\hat{\mathbf{x}}_{0}||}^{2}$

14: Output: $s$

15: Dequeue from sliding window queue

16: end if

17: end for

The anomaly detector integrates diffusion models and thus is adept at capturing the inherent spatiotemporal patterns and stochastic motion tendencies of upper-limb movements through analysis of sensor data. Consequently, anomaly scores can be computed in real time for human–robot interactions. These scores serve as indicators of the comfort levels and the naturalness of the assistance provided by the upper-limb exoskeleton robot.
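The scoring loop of Algorithm 2 can be sketched as follows. Several parts are assumptions made only so that the example runs: the trained noise-prediction network is replaced by a stand-in callable `eps_model(x, t)`, the variance schedule is linear, and the injected noise is scaled by $\sqrt{\tilde{\beta}_{t}^{p}}$ as in standard DDPM sampling (Algorithm 2 writes the coefficient as $\tilde{\beta}_{t}^{p}$). A `deque` with a fixed maximum length plays the role of the sliding window queue.

```python
from collections import deque
import numpy as np

def anomaly_score(x0, eps_model, betas, nu, rng):
    """Noise x0 to step nu, denoise back to step 0, return the squared reconstruction error."""
    betas = np.asarray(betas, dtype=float)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    alpha_bars_prev = np.concatenate([[1.0], alpha_bars[:-1]])
    beta_tildes = (1.0 - alpha_bars_prev) / (1.0 - alpha_bars) * betas

    a_bar = alpha_bars[nu - 1]
    x_hat = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * rng.standard_normal(x0.shape)
    for t in range(nu, 0, -1):
        a, ab = alphas[t - 1], alpha_bars[t - 1]
        mean = (x_hat - (1.0 - a) / np.sqrt(1.0 - ab) * eps_model(x_hat, t)) / np.sqrt(a)
        # Assumed noise scale sqrt(beta_tilde); it is exactly zero at t = 1.
        z = rng.standard_normal(x0.shape)
        x_hat = mean + np.sqrt(beta_tildes[t - 1]) * z
    return float(np.sum((x0 - x_hat) ** 2))

# Stand-in for the trained network (hypothetical): predicts zero noise.
zero_model = lambda x, t: np.zeros_like(x)

L_s, N_c = 4, 3                         # window width and channel count (illustrative)
betas = np.linspace(1e-4, 0.02, 50)     # assumed linear schedule
rng = np.random.default_rng(0)
window = deque(maxlen=L_s)              # oldest sample is dropped automatically
scores = []
for k in range(6):
    window.append(np.full(N_c, 0.1 * k))
    if len(window) == L_s:
        x0 = np.concatenate(list(window))
        scores.append(anomaly_score(x0, zero_model, betas, nu=10, rng=rng))
```

With a network trained on healthy motion, windows resembling the training data would be reconstructed closely and yield low scores, whereas unusual windows would be reconstructed poorly and yield high scores.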

Reference Generation: To customize assistance during passive following training, historical upper-limb trajectories, coupled with online trajectory refinement as outlined in (11), are integrated to develop a probabilistic model. In this context, ProMPs are employed to encode a set of trajectories into a probabilistic framework (Paraschos et al., 2013), which is capable of generating similar references through sampling. ProMP-based trajectory sampling is particularly suited to the repetitive movements encountered in passive following training, because the probabilistic model accommodates sensor noise, human uncertainty, and individual biases.

To implement the ProMPs, we express the trajectory by means of the weight vector $\bm{\omega}\in\Re^{Dn\times 1}$ , where $D$ is the number of basis functions, and $n$ is the number of active joints, such that

$\displaystyle\bm{y}_{t}$	$\displaystyle=\left[\begin{array}{ccc}\bm{q}_{1,t}^{\mathsf{T}}&\cdots&\bm{q}_{n,t}^{\mathsf{T}}\end{array}\right]^{\mathsf{T}}=\begin{bmatrix}\bm{\Phi}_{t}\\ \dot{\bm{\Phi}}_{t}\end{bmatrix}\bm{\omega}+\bm{\epsilon}_{y},$	(29)
$\displaystyle\bm{q}_{i,t}$	$\displaystyle=\left[\begin{array}{cc}q_{i,t}&\dot{q}_{i,t}\end{array}\right]^{\mathsf{T}},$	(31)
$\displaystyle p(\bm{\tau}_{y}|\bm{\omega})$	$\displaystyle=\prod_{t}\mathcal{N}(\bm{y}_{t}|\bm{\Phi}_{t}\bm{\omega},\bm{\Sigma}_{y}),$	(32)

where $\bm{q}_{i,t}\in\Re^{2}$ represents the composite vector of the $i^{th}$ joint at time step $t$ , $\bm{\epsilon}_{y}\sim\mathcal{N}(\bm{0},\bm{\Sigma}_{y})$ represents zero-mean i.i.d. Gaussian noise, $\bm{\tau}_{y}$ is the trajectory over the demonstration, and $\bm{\Phi}_{t}\in\Re^{n\times Dn}$ , chosen as a Gaussian form, is the time-variant basis matrix.

Given the assumption that $\bm{\omega}$ follows a normal distribution, $\bm{\omega}\sim\mathcal{N}(\bm{\omega}|\bm{\mu}_{\omega}^{(k)},\bm{\Sigma}_{\omega}^{(k)})$ , a new trajectory at time step $t$ can be modeled as follows:

\displaystyle p(\bm{y}_{t};\bm{\mu}_{\omega}^{(k)},\bm{\Sigma}_{\omega}^{(k)})=\int\mathcal{N}(\bm{y}_{t}|\bm{\Phi}_{t}\bm{\omega},\bm{\Sigma}_{y})\mathcal{N}(\bm{\omega}|\bm{\mu}_{\omega}^{(k)},\bm{\Sigma}_{\omega}^{(k)})d\bm{\omega}.

(33)

Therefore, the reference trajectory is given as follows:

\displaystyle\bm{q}_{r}(t)=[q_{1,t},\cdots,q_{n,t}].

(34)

To facilitate the generation of personalized assistance, the following cost function is introduced to evaluate the performance of the provided reference trajectory:

\displaystyle\mathcal{S}(\bm{q}_{r})=\int_{T_{r}}\{\|\bm{q}_{d}(\bm{q}_{r})-\bm{q}\|_{\bm{Q}}^{2}+s^{2}\}dt,

(35)

where $T_{r}$ signifies the duration of the assistance, and $\bm{q}_{d}$ denotes the desired trajectory obtained from online refinement via (11).

Given that each sampled trajectory can be evaluated by its cost, an information-theoretic weight can be assigned to each trajectory, reflecting its performance relative to all $k$ trajectories (Williams et al., 2017):

\displaystyle w^{(i)}=\frac{1}{\eta_{k}}\exp(-\frac{1}{\lambda_{p}}\mathcal{S}(\bm{q}_{r}^{(i)})),

(36)

where $\bm{q}_{r}^{(i)}$ is the reference trajectory from the $i^{th}$ sampling, $\eta_{k}$ is a normalization constant, and $\lambda_{p}$ is a small positive constant. A detailed analysis of the adopted weight setting is included in the Appendix.
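The weighting in (36) is a softmax over negative costs and can be sketched as below. The sketch assumes that $\eta_{k}$ is chosen so that the weights sum to one; subtracting the minimum cost before exponentiating is a standard numerical safeguard that leaves the normalized weights unchanged. The function name is hypothetical.

```python
import numpy as np

def trajectory_weights(costs, lambda_p):
    """Normalized weights w_i proportional to exp(-S_i / lambda_p)."""
    costs = np.asarray(costs, dtype=float)
    # Shifting by the minimum cost cancels in the normalization
    # but avoids underflow when lambda_p is small.
    w = np.exp(-(costs - costs.min()) / lambda_p)
    return w / w.sum()

weights = trajectory_weights([1.0, 2.0, 3.0], lambda_p=0.5)
equal = trajectory_weights([1.0, 1.0], lambda_p=0.1)
```

A smaller $\lambda_{p}$ concentrates the weight on the lowest-cost trajectories, whereas a larger value spreads it more evenly.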

Considering the performance of the trajectory, the parameters $\bm{\mu}_{\omega}^{(k)}$ and $\bm{\Sigma}_{\omega}^{(k)}$ are deduced from the $k$ collected trajectories, as follows:

$\displaystyle\bm{\omega}^{(i)}$	$\displaystyle=(\bm{\Phi}^{\mathsf{T}}\bm{\Phi})^{-1}\bm{\Phi}^{\mathsf{T}}\bm{Y}^{(i)},$	(37)
$\displaystyle\bm{\mu}_{\omega}^{(k)}$	$\displaystyle=\sum_{i}w^{(i)}\bm{\omega}^{(i)},$	(38)
$\displaystyle\bm{\Sigma}_{\omega}^{(k)}$	$\displaystyle=\sum_{i}w^{(i)}(\bm{\omega}^{(i)}-\bm{\mu}_{\omega}^{(k)})(\bm{\omega}^{(i)}-\bm{\mu}_{\omega}^{(k)})^{\mathsf{T}},$	(39)

where $\bm{\Phi}\in\Re^{N_{r}n\times Dn}$ is a matrix comprised of block diagonal matrices $\bm{\Phi}_{t}$ , stacked vertically in accordance with sampling number $N_{r}$ , and $\bm{Y}^{(i)}\in\Re^{N_{r}n}$ corresponds to the $i^{th}$ gathered trajectory. This process facilitates the construction of the probabilistic model.
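The estimation steps (37) to (39) can be sketched for a single joint as follows. The basis construction (normalized Gaussians over a phase variable in $[0,1]$), the basis width, and the small ridge term added for conditioning are assumptions introduced for illustration; the function names are hypothetical.

```python
import numpy as np

def gaussian_basis(n_steps, n_basis, width=0.05):
    """Normalized Gaussian basis over phase in [0, 1]; returns (n_steps, n_basis)."""
    z = np.linspace(0.0, 1.0, n_steps)[:, None]
    centers = np.linspace(0.0, 1.0, n_basis)[None, :]
    phi = np.exp(-0.5 * (z - centers) ** 2 / width)
    return phi / phi.sum(axis=1, keepdims=True)

def fit_weights(Phi, Y, ridge=1e-8):
    """Least-squares weights as in (37); the ridge term is an added safeguard."""
    G = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(G, Phi.T @ Y)

def weighted_distribution(omegas, w):
    """Weighted mean and covariance as in (38) and (39); w is assumed to sum to one."""
    mu = w @ omegas
    diff = omegas - mu
    Sigma = (w[:, None] * diff).T @ diff
    return mu, Sigma

# Two synthetic demonstrations of one joint (illustrative data only).
Phi = gaussian_basis(n_steps=50, n_basis=8)
t = np.linspace(0.0, 1.0, 50)
demos = [np.sin(np.pi * t), 1.1 * np.sin(np.pi * t)]
omegas = np.array([fit_weights(Phi, y) for y in demos])
mu, Sigma = weighted_distribution(omegas, np.array([0.5, 0.5]))
q_ref = Phi @ mu  # mean reference trajectory
```

Sampling a weight vector from $\mathcal{N}(\bm{\mu}_{\omega},\bm{\Sigma}_{\omega})$ and multiplying it by the basis matrix then yields a new reference trajectory.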

The adjusted trajectory is integrated into the upper-limb exoskeleton robot to assist the patient. Subsequently, the patient’s actual movement during the training sessions is captured and used to iteratively refine the probabilistic model (33). Specifically, the parameters are updated as $\bm{\mu}_{\omega}^{(k)}\leftarrow\bm{\mu}_{\omega}^{(k+1)}$ and $\bm{\Sigma}_{\omega}^{(k)}\leftarrow\bm{\Sigma}_{\omega}^{(k+1)}$, and the updated model is then used to plan subsequent trajectories, as illustrated in Algorithm 3. Here, $N_{s}$ is the number of times that free exploration is performed based on the pre-collected demonstrations. The refinement of assistance is governed by the anomaly score, which reflects prior healthy movement behavior. This score serves as an indicator of movement comfort and naturalness during rehabilitation, so that the trajectory adjustments improve the quality of assistance.

The optimal assistance distribution is identified via a coarse-to-fine approach. Initially, the sample distribution is established based on pre-defined demonstrations, which primarily guide the exploratory phase using the initial samples. Subsequent iterations improve the assistance distribution by progressively narrowing the sampling space and utilizing previously explored optimal trajectory values. The probability model focuses on the best-performing trajectories to provide optimal assistance, which effectively addresses the various uncertainties that the patient may encounter during the task. Incorporation of the patient’s motion data ensures that subsequent trajectories become increasingly individualized and thus align more closely with the patient’s medical needs. Hence, this method enables the upper-limb exoskeleton robot to exploit online interactions for the dual purpose of enhancing interaction safety and improving the efficacy of passive following training.

Algorithm 3 Reference Trajectory Generation

Input: $\bm{Y}^{(k+1)}$

1: if $k\leq N_{s}$ then

2: Initialize $\bm{\mu}_{\omega}^{(k+1)},\bm{\Sigma}_{\omega}^{(k+1)}$ with pre-collected data.

3: else

4: Calculate cost $\mathcal{S}(\bm{q}_{r}^{(k+1)})$ using Eq. (35)

5: for all collected $k+1$ trajectories do

6: Calculate weight $w^{(i)}$ and $\bm{\omega}^{(i)}$ using Eq. (36) and Eq. (37)

7: end for

8: Calculate distribution $\mathcal{N}(\bm{\omega}|\bm{\mu}_{\omega}^{(k+1)},\bm{\Sigma}_{\omega}^{(k+1)})$ using Eq. (38) and Eq. (39)

9: end if

10: $\bm{\omega}\leftarrow\text{Sample}(\mathcal{N}(\bm{\omega}|\bm{\mu}_{\omega}^{(k+1)},\bm{\Sigma}_{\omega}^{(k+1)}))$

11: $\bm{q}_{r}(t)\leftarrow\bm{\Phi}_{t}\bm{\omega}$

5 Interaction Control

The joint below the shoulder joints is cable-driven, which decreases the inertia associated with movement and thus increases comfort. However, the cable drive introduces substantial friction, which can hinder joint movement and should therefore be compensated for.

To achieve this, the upper-limb exoskeleton robot first performs movements in the absence of the patient, where $\bm{\tau}_{e}=\bm{0}$ . In addition, the disturbance torque can be parameterized as follows (De Wit et al., 1995):

	$\displaystyle\bm{\tau}_{f}$	$\displaystyle=(\bm{a}_{f}+\bm{b}_{f}\odot e^{-\bm{c}_{f}\odot\dot{\bm{q}}}+\bm{d}_{f}\odot\dot{\bm{q}})\odot\bm{sgn}(\dot{\bm{q}})$
		$\displaystyle\approx(\bar{\bm{a}}_{f}+\bar{\bm{b}}_{f}\odot\dot{\bm{q}}+\bar{\bm{c}}_{f}\odot\dot{\bm{q}}\odot\dot{\bm{q}})\odot\bm{sgn}(\dot{\bm{q}})=\bm{Y}(\dot{\bm{q}})\bm{\zeta},$		(40)

where $\bm{a}_{f},\bm{b}_{f},\bm{c}_{f},\bm{d}_{f}$ are the unknown parameters, $\bar{\bm{a}}_{f},\bar{\bm{b}}_{f},\bar{\bm{c}}_{f}$ are derived from the Taylor expansion as simplifications for the model, $\odot$ denotes the Hadamard (element-wise) product, $\bm{sgn}(\cdot)$ represents a sign function, $\bm{Y}(\cdot)$ represents a regressor matrix, and $\bm{\zeta}$ represents the vector of model parameters. The approximation presented in (40) is reasonable because the velocities of the joints of the upper-limb exoskeleton robot remain rather low during a given rehabilitation process.

As the upper-limb exoskeleton robot is equipped with force sensors on all of its active joints, friction can be directly measured and recorded in the absence of a patient. Together with the recorded joint velocities, these measurements are used to learn the parameters of the friction model via polynomial fitting. The estimated friction is represented as follows:

\displaystyle\hat{\bm{\tau}}_{f}=\bm{Y}(\dot{\bm{q}})\hat{\bm{\zeta}}.

(41)
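The polynomial fitting behind (41) can be sketched for a single joint as an ordinary least-squares problem. The regressor below follows the approximated form in (40) literally, with columns $\mathrm{sgn}(\dot{q})$, $\dot{q}\,\mathrm{sgn}(\dot{q})$, and $\dot{q}^{2}\,\mathrm{sgn}(\dot{q})$; the synthetic data and function names are assumptions for illustration, not the recorded measurements.

```python
import numpy as np

def friction_regressor(qd):
    """Regressor Y(qd) for one joint, one row per velocity sample."""
    qd = np.asarray(qd, dtype=float)
    s = np.sign(qd)
    return np.column_stack([s, qd * s, qd**2 * s])

def fit_friction(qd, tau_measured):
    """Least-squares estimate of the friction parameters zeta."""
    Y = friction_regressor(qd)
    zeta_hat, *_ = np.linalg.lstsq(Y, tau_measured, rcond=None)
    return zeta_hat

# Synthetic, noise-free data generated from known parameters (illustrative only).
qd = np.linspace(-1.0, 1.0, 41)
zeta_true = np.array([0.3, 0.1, 0.05])
tau = friction_regressor(qd) @ zeta_true
zeta_hat = fit_friction(qd, tau)
tau_hat = friction_regressor(qd) @ zeta_hat  # estimated friction, as in (41)
```

With real recordings the measurements would be noisy, so the recovered parameters would only approximate the underlying values.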

Once the friction is estimated, a variable impedance model is proposed. This model must be capable of identifying the human–robot interaction condition and addressing the conflict during rehabilitation. These capabilities enable the model to regulate the action of the upper-limb exoskeleton robot.

To regulate the impedance model, a weighting function (Zhang et al., 2023) is introduced to consider the anomaly score. This function is mathematically defined as follows:

\displaystyle w(s)=\lambda_{1}\tanh(-\frac{s}{\chi_{1}}+\chi_{2})+\lambda_{2},

(42)

where $\lambda_{1}$ and $\lambda_{2}$ are positive constants that determine the range and median of the weighting function, respectively, $\chi_{1}$ is a constant that normalizes the anomaly score into a specified small range, and $\chi_{2}$ is the offset of the weighting function from the origin of the coordinates along the positive horizontal axis. Based on this function, the desired impedance model is redefined as follows:

\displaystyle\bm{C}_{d}(\dot{\bm{q}}-\dot{\bm{q}}_{d})+\bm{K}_{d}(\bm{q}-\bm{q}_{d})=\frac{1}{w(s)}\bm{\tau}_{e}.

(43)

Multiplying both sides of (43) by $w(s)$ yields

\displaystyle\bm{C}_{a}(t)(\dot{\bm{q}}-\dot{\bm{q}}_{d})+\bm{K}_{a}(t)(\bm{q}-\bm{q}_{d})=\bm{\tau}_{e},

(44)

where $\bm{C}_{a}(t)\triangleq w(s)\bm{C}_{d}$ and $\bm{K}_{a}(t)\triangleq w(s)\bm{K}_{d}$ are the time-varying apparent impedance parameters. These parameters are used to relate the desired trajectory to the interaction torque, as outlined in (11). An increase in the anomaly score leads to a decrease in $w(s)$, thereby reducing the magnitude of the impedance parameters, so that the upper-limb exoskeleton robot responds more passively to any detected conflicts. Conversely, when the anomaly score is low, the impedance parameters revert to their original values, thereby maintaining the level of assistance provided by the upper-limb exoskeleton robot.
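The effect of the weighting function (42) on the apparent impedance in (44) can be sketched as follows. The numerical values of $\lambda_{1}$, $\lambda_{2}$, $\chi_{1}$, $\chi_{2}$, $\bm{C}_{d}$, and $\bm{K}_{d}$ are hypothetical and chosen only so that $w(s)$ stays positive ($\lambda_{2}>\lambda_{1}$); the function names are likewise assumptions.

```python
import numpy as np

def impedance_weight(s, lam1, lam2, chi1, chi2):
    """Weighting function of (42): decreases monotonically with the anomaly score s."""
    return lam1 * np.tanh(-s / chi1 + chi2) + lam2

def apparent_impedance(s, C_d, K_d, lam1, lam2, chi1, chi2):
    """Apparent damping and stiffness C_a = w(s) C_d and K_a = w(s) K_d."""
    w = impedance_weight(s, lam1, lam2, chi1, chi2)
    return w * C_d, w * K_d

# Hypothetical parameters (assumptions, not the values used on the robot).
params = dict(lam1=0.4, lam2=0.6, chi1=1.0, chi2=2.0)
C_d = np.diag([2.0, 2.0])
K_d = np.diag([20.0, 20.0])

w_low = impedance_weight(0.0, **params)    # low anomaly score
w_high = impedance_weight(5.0, **params)   # high anomaly score
C_a, K_a = apparent_impedance(5.0, C_d, K_d, **params)
```

With these values the weight drops from close to one at a zero score to roughly 0.2 at a score of five, so the robot yields more readily when the interaction looks abnormal.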

Subsequently, an impedance vector is introduced, as follows:

	$\displaystyle\bm{z}$	$\displaystyle=\dot{\bm{q}}-\dot{\bm{q}}_{r}$
		$\displaystyle=\dot{\bm{q}}-\dot{\bm{q}}_{d}+\bm{C}_{d}^{-1}\bm{K}_{d}(\bm{q}-\bm{q}_{d})-\frac{1}{w(s)}\bm{C}_{d}^{-1}\bm{\tau}_{e},$		(45)

where

\displaystyle\dot{\bm{q}}_{r}=\dot{\bm{q}}_{d}-\bm{C}_{d}^{-1}\bm{K}_{d}(\bm{q}-\bm{q}_{d})+\frac{1}{w(s)}\bm{C}_{d}^{-1}\bm{\tau}_{e}

(46)

is a reference vector. According to (45), the convergence of $\bm{z}\rightarrow\bm{0}$ implies the realization of the desired impedance model (43).
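The relationship between (43), (45), and (46) can be checked numerically. The sketch below computes the reference vector and the impedance vector, and then constructs an interaction torque that satisfies (43) exactly, in which case $\bm{z}$ should vanish. All numerical values and function names are illustrative assumptions.

```python
import numpy as np

def reference_velocity(q, q_d, qd_d, tau_e, C_d, K_d, w):
    """Reference vector of (46)."""
    C_inv = np.linalg.inv(C_d)
    return qd_d - C_inv @ (K_d @ (q - q_d)) + C_inv @ tau_e / w

def impedance_vector(qd, qd_r):
    """Impedance vector z of (45)."""
    return qd - qd_r

# Hypothetical two-joint example.
C_d = np.diag([2.0, 3.0])
K_d = np.diag([10.0, 20.0])
w = 0.5
q, q_d = np.array([0.1, -0.2]), np.array([0.0, 0.0])
qd, qd_d = np.array([0.3, 0.1]), np.array([0.2, 0.0])

# Interaction torque chosen so that the impedance model (43) holds exactly.
tau_e = w * (C_d @ (qd - qd_d) + K_d @ (q - q_d))
z = impedance_vector(qd, reference_velocity(q, q_d, qd_d, tau_e, C_d, K_d, w))
```

Any other torque leaves a nonzero $\bm{z}$, which is the error that the controller drives to zero.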

The overall control input is designed as defined in (3), with the fast time-scale control term defined as in (4). Next, the slow time-scale control term is established by using the estimated friction $\hat{\bm{\tau}}_{f}$ to stabilize the dynamics expressed in (9) and achieve the desired impedance model, as follows:

	$\displaystyle\bm{u}_{s}=$	$\displaystyle-\bm{K}_{z}\bm{z}-\bm{S}_{2}^{\mathsf{T}}\hat{\bm{\tau}}_{f}-{\bm{\tau}}_{e}-k_{g}\cdot{\bm{sgn}}(\bm{z})$
		$\displaystyle+(\bm{M}(\bm{q})+\bar{\bm{B}})\ddot{\bm{q}}_{r}+\bm{C}(\dot{\bm{q}},\bm{q})\dot{\bm{q}}_{r}+\bm{g}(\bm{q}),$		(47)

where $k_{g}$ is a positive constant, $\bm{K}_{z}\in\Re^{n\times n}$ is a diagonal and positive-definite matrix, and ${\bm{sgn}}(\cdot)$ is the sign function, defined as follows:

\displaystyle{\rm sgn}(z)=\begin{cases}1,&z>0,\\ 0,&z=0,\\ -1,&z<0.\end{cases}

(51)

The proposed variable impedance controller, as delineated in (3), (4), and (47), can be demonstrated to be exponentially stable, as shown by the stability analysis provided in the Appendix.

6 Experiments

The proposed dual-mode trajectory refinement method was implemented within the upper-limb exoskeleton robot to assess the effectiveness of both rehabilitation modes. Figure 8 depicts the experimental configuration, in which the main board controlled the impedance of the upper-limb exoskeleton robot. This board was interfaced via a serial port connection with a PC, which was outfitted with an Intel i5-13490F CPU and an RTX 4060Ti graphics card. The PC executed the anomaly detection module, calculated the reference trajectory $\bm{q}_{r}$ , and dynamically updated the desired trajectory $\bm{q}_{d}$ online. Subjects performed rehabilitation training in passive following mode or active mirroring mode. In the latter mode, the subject wore a brace equipped with markers on the unaffected side of the body to facilitate engagement in rehabilitation exercises. Optical motion capture equipment (Nokov) was used to accurately capture motion intentions.

Implementation of the variable impedance controller required knowledge of the dynamic model. Thus, the dynamic parameters were computed analytically in real time using the open-source Orocos Kinematics and Dynamics Library (https://www.orocos.org/wiki/orocos/kdl-wiki.html). In this computation, the upper-limb exoskeleton robot is decomposed into a sequence of links and joints to formulate a model defining its physical configuration, encompassing characteristics such as the length, mass, and inertia of each link. Subsequently, the forward dynamics are derived using Newton–Euler equations based on this model.

The intention predictor and the anomaly detection module required motion data, which was collected from the upper-limb exoskeleton robot during operation by healthy subjects. To facilitate free and natural movement, the upper-limb exoskeleton robot was set to operate in a transparent mode during the data collection phase (Chen et al., 2023a). Specifically, the controller was designed as follows (Zimmermann et al., 2020):

$\displaystyle\bm{u}_{0}=(\bm{M}(\bm{q})+\bar{\bm{B}})\ddot{\bm{q}}_{0}+\bm{C}(\dot{\bm{q}},\bm{q})\dot{\bm{q}}+\bm{g}(\bm{q})-\bm{S}_{2}^{\mathsf{T}}\hat{\bm{\tau}}_{f}-\bm{\tau}_{e}+\bm{u}_{f},$	(52)
$\displaystyle\ddot{\bm{q}}_{0}=\frac{1}{\gamma_{0}}(\bm{M}(\bm{q})+\bar{\bm{B}})^{-1}\bm{\tau}_{e},$	(53)

where $\gamma_{0}$ represents a parameter controlling the magnitude of virtual mass, and $\ddot{\bm{q}}_{0}$ denotes the desired acceleration. In this transparent mode, healthy subjects or patients were able to maneuver the upper-limb exoskeleton robot effortlessly, without experiencing significant discomfort.
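The transparent-mode law in (52) and (53) can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' code: the function and argument names are hypothetical, and the fast time-scale term $\bm{u}_{f}$ is taken as an already-computed input.

```python
import numpy as np

def transparent_control(M, B_bar, C, g, qd, S2, tau_f_hat, tau_e, u_f, gamma_0):
    """Evaluate the transparent-mode input u_0 of (52) with qdd_0 from (53).

    Names and shapes are illustrative assumptions: M, B_bar, C are (n, n);
    g, qd, tau_e, u_f are (n,); S2 is (m, n); tau_f_hat is (m,).
    """
    inertia = M + B_bar
    # (53): desired acceleration driven by the measured interaction torque.
    qdd_0 = np.linalg.solve(inertia, tau_e) / gamma_0
    # (52): model-based compensation plus the fast time-scale term u_f.
    u_0 = inertia @ qdd_0 + C @ qd + g - S2.T @ tau_f_hat - tau_e + u_f
    return u_0, qdd_0
```

Substituting (53) into (52) gives $(\bm{M}(\bm{q})+\bar{\bm{B}})\ddot{\bm{q}}_{0}=\bm{\tau}_{e}/\gamma_{0}$, so the interaction torque enters $\bm{u}_{0}$ with a net gain of $1/\gamma_{0}-1$. A smaller $\gamma_{0}$ therefore makes the robot respond more strongly to the same interaction torque.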

The next section presents the results of five experiments, which are described below.

- Intention Predictor: This experiment aimed to validate the efficacy and accuracy of the proposed intention predictor. A diverse set of intention prediction methods were trained using the collected data, and then a comparative analysis was performed to demonstrate the superiority of the proposed intention predictor.
- Anomaly Detection: This experiment aimed to evaluate the performance of the anomaly detection network. Its detection accuracy was demonstrated in various simulated anomaly scenarios, including movements outside the normal range, stroke-induced convulsions, and human–robot interaction conflicts. Furthermore, comparative studies were conducted to illustrate that the proposed anomaly detector, which is based on a diffusion model, exhibits detection accuracy that is significantly better than that exhibited by other anomaly detection methods.
- Interaction Control: This experiment aimed to assess the dynamic capabilities of the system and its ability to reject disturbances. Thus, a trajectory tracking task was conducted. In addition, the efficacy of the proposed variable impedance controller was validated in a scenario involving anomalies.
- Active Mirroring Training: This experiment aimed to assess the effectiveness of the online trajectory refinement. Thus, ablation studies were conducted. In addition, how the anomaly score influences the trajectories generated by this refinement process was examined. Furthermore, an active mirroring mode was implemented, and the motion capture system was used to verify that the proposed method effectively constrains assistive trajectories and maintains safety throughout a rehabilitation process.
- Passive Following Training: This experiment aimed to validate the improvements in rehabilitation facilitated by passive following training. An ablation study was conducted to demonstrate that the online trajectory refinement significantly enhances movement naturalness and task performance throughout the training process. Moreover, a clinical trial was performed with stroke patients to obtain evidence that this rehabilitation framework significantly aids in the recovery of motor functions.

7 Results

Two able-bodied participants with no prior experience with upper-limb exoskeleton robots were recruited for motion data collection. Table 3 presents the motion range of the upper-limb exoskeleton robot during data collection. This motion range covers the main spatial areas of upper-limb daily activities.

Table 3: Motion Range of Joints

Joint	1	2	3	4	5
Min (°)	-90	-45	-30	-5	-30
Max (°)	30	115	30	120	30

Table 4 presents an overview of the participants’ statistical attributes. The participants signed a written informed consent form prior to the experimental sessions. Next, the participants underwent a 6-minute pre-training phase to familiarize themselves with moving while wearing the upper-limb exoskeleton robot. During the data collection phase, no specific movement guidelines were imposed. Therefore, the participants were permitted to move the upper-limb exoskeleton robot at their preferred speed, i.e., rapidly or slowly, or to maintain it in a stationary position. The collected data were compiled into a dataset that was utilized for training the intention predictor and anomaly detection modules.

Table 4: Statistical information of the subjects

Subject	Gender	Age(y)	Weight(kg)	Height(cm)
1	Male	30	87	181
2	Male	24	70	172

7.1 Intention Predictor

To achieve real-time estimation of human motion intention, the collected motion data were utilized to train the intention predictor. To enhance the generalizability of the trained model, we partitioned the collected motion data into training, testing, and validation sets in an 8:2:1 ratio during the training process. We employed Trajectron++ (Salzmann et al., 2020) as the backbone for extracting trajectory features and selected a transformer architecture to manage the diffusion process. The parameters for the intention predictor were set as follows: $N_{o}=5$ , $N_{p}=7$ , and $T=100$ . Denoising diffusion implicit models (DDIMs) (Song et al., 2020) were employed in the inference process. Intention prediction relies on historical observation data, and the predicted trajectory length was rather short. Therefore, we selected forward integration and a convolutional neural network–long short-term memory (CNN-LSTM)-based neural network that previously demonstrated excellent performance in estimating robot joint behavior (Kim and Cho, 2019) as the baselines for the experiments.

Table 5: Prediction Results for Different Methods

Method	FDE (°)	ADE (°)	MAE (°)	RMSE (°)
Forward-int.	0.171	0.102	0.055	0.066
CNN-LSTM	0.165 $\uparrow$	0.099 $\uparrow$	0.056 $\downarrow$	0.067 $\downarrow$
Ours	0.096 $\uparrow$	0.061 $\uparrow$	0.033 $\uparrow$	0.040 $\uparrow$

Furthermore, we adopted final displacement error (FDE), average displacement error (ADE), mean absolute error (MAE), and root-mean-square error (RMSE) as the evaluation metrics to measure the accuracy of the predictions. FDE and ADE were computed on a two-dimensional plane defined by Joint 2 and Joint 4, which possess the largest range of motion and exhibit the most frequent movements. MAE and RMSE were averaged across all active joints. The predictions for the validation set are presented in Table 5. These results suggest that in the considered real-time intention prediction task, the trajectory prediction performance of the CNN-LSTM-network-based method was on a par with that of the forward integration method. However, our intention predictor based on a diffusion model significantly outperformed the forward integration method. Specifically, our intention predictor showed a 43.9 $\%$ lower FDE, a 40.2 $\%$ lower ADE, a 40.0 $\%$ lower MAE, and a 39.4 $\%$ lower RMSE than the forward integration method.
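The four metrics and the quoted percentage reductions can be reproduced with the short sketch below. It assumes the standard definitions of FDE and ADE (Euclidean distance at the final step and averaged over all steps, here on the Joint 2–Joint 4 plane) and pools MAE and RMSE over all entries, which is one possible reading of "averaged across all active joints". The function names are illustrative.

```python
import numpy as np

def displacement_errors(pred, truth):
    """FDE and ADE for trajectories of shape (steps, dims)."""
    dist = np.linalg.norm(pred - truth, axis=1)  # Euclidean error per step
    return dist[-1], dist.mean()

def joint_errors(pred, truth):
    """MAE and RMSE pooled over all steps and joints."""
    err = pred - truth
    return np.abs(err).mean(), np.sqrt((err ** 2).mean())

def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline

# Table 5 values: (forward integration, proposed predictor), in degrees.
table5 = {"FDE": (0.171, 0.096), "ADE": (0.102, 0.061),
          "MAE": (0.055, 0.033), "RMSE": (0.066, 0.040)}
reductions = {k: round(relative_reduction(b, o), 1) for k, (b, o) in table5.items()}
```

Applying `relative_reduction` to the Table 5 entries recovers the reductions of 43.9%, 40.2%, 40.0%, and 39.4% reported in the text.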

A comparative analysis was conducted to validate the applicability of our intention predictor across different movement conditions in four tasks. Two tasks were based on motion data that were collected while the participants were wearing the upper-limb exoskeleton robot, including data on free movements similar to the training data and arm-swinging sinusoidal movements. The other two tasks were based on simulated smooth trajectories, namely circular and lemniscate trajectories. Not all tested trajectories were included in the pre-collected dataset employed in the training phase. Table 6 reports the performance of the intention predictor in the four tasks, averaged across five active joints. It can be seen that our intention predictor exhibits the best performance in all metrics.

Table 6: Prediction Results for Different Tasks

Task	Method	FDE (°)	ADE (°)	MAE (°)	RMSE (°)
Free move	Forward integration	0.209	0.125	0.076	0.090
	CNN-LSTM	0.163	0.098	0.060	0.073
	Ours	0.111 (46.9$\%\uparrow$)	0.071 (43.2$\%\uparrow$)	0.042 (44.7$\%\uparrow$)	0.052 (42.2$\%\uparrow$)
Sinusoidal	Forward integration	0.731	0.379	0.243	0.297
	CNN-LSTM	0.739	0.387	0.248	0.302
	Ours	0.616 (15.7$\%\uparrow$)	0.331 (12.7$\%\uparrow$)	0.212 (12.8$\%\uparrow$)	0.258 (13.2$\%\uparrow$)
Circular	Forward integration	0.078	0.037	0.024	0.029
	CNN-LSTM	0.101	0.054	0.031	0.038
	Ours	0.066 (15.4$\%\uparrow$)	0.029 (21.6$\%\uparrow$)	0.018 (25.0$\%\uparrow$)	0.024 (17.2$\%\uparrow$)
Lemniscate	Forward integration	0.111	0.049	0.032	0.041
	CNN-LSTM	0.095	0.045	0.028	0.036
	Ours	0.086 (22.5$\%\uparrow$)	0.035 (28.6$\%\uparrow$)	0.022 (31.3$\%\uparrow$)	0.029 (29.3$\%\uparrow$)

Figure 9 illustrates trajectories from steps 1, 4, and 7 of trajectory prediction, together with the actual reference trajectory. For clarity, we present the experimental results in a two-dimensional plane in which Joint 2 is the x-axis and Joint 4 is the y-axis. Moreover, the right side of each subplot shows an expanded view of the trajectory prediction details within the black box. It can be seen that the trajectory prediction at step 1 aligns well with the actual trajectory, while the trajectory predictions at steps 4 and 7 deviate slightly from the real trajectory. These results demonstrate that as the number of prediction steps increases, the prediction accuracy slightly decreases and the corresponding prediction variance increases. This trend is explicitly considered in the preemptive tuning algorithm. Overall, the above-mentioned results confirm that our intention predictor reliably forecasts upper-limb joint movements and certain regular motions by capturing the dynamic trends of trajectories based on historical observations. The free-move task was directly related to human movements in practical applications, as it simulated real-life scenarios. In contrast, the sinusoidal, circular, and lemniscate tasks were designed to test the generalization performance of our intention predictor in contexts other than rehabilitation training.

7.2 Anomaly Detection

The collected motion data, inclusive of interaction information, were integrated to train our anomaly detector, which operated in real time to evaluate safety and the naturalness of motion. These data served as a repository of safe and natural interaction feedback, enabling the anomaly detector to discern their latent relationship and subsequently identify abnormal interactions during human–robot interaction. The parameters were $L_{s}=100$ and $N_{c}=21$, denoting the history of joint and motor motion data, in addition to interaction torque. The diffusion model parameters during both training and inference phases were $T^{p}=100$ and $\nu=60$. DDIMs were also employed during inference.

During the experiment, a participant was required to move while wearing the upper-limb exoskeleton robot to simulate various anomalies. Three abnormal scenarios were considered: an excessive movement scenario (involving joint movements beyond the normal range), a balance deviation scenario (involving deviations from the relative balance position), and a simulated stroke tremor scenario. In the excessive movement scenario, the participant initially maintained the upper-limb exoskeleton robot within the normal movement range and then lowered the arm to simulate the anomaly. As the arm was gradually lowered, the shoulder joint angle decreased and progressively crossed the motion boundary, so the anomaly level increased gradually. Subsequently, the arm was raised back to the normal motion range. The results are presented in Figure 10 and reveal that as the joint positions gradually approached and then crossed the boundary, the anomaly score increased. These results demonstrate that our anomaly detector effectively detected deviations from the normal motion range (i.e., the motion range of the collected data).

In the simulated stroke tremors scenario, the participant initially maintained the upper-limb exoskeleton robot in the rest phase. Subsequently, the participant shook the entire arm of the upper-limb exoskeleton robot to simulate tremors. The corresponding anomaly score and joint positions are depicted in Figure 11. It is evident that during the simulated stroke tremors, the anomaly score generally increased, and while the tremors persisted, it remained rather high.

In the balance deviation scenario, the participant initially maintained the upper-limb exoskeleton robot in a static equilibrium position that represented the balance state in which the upper-limb exoskeleton robot offered assistance. The participant was then instructed to manipulate the upper-limb exoskeleton robot upward and downward to simulate misalignment during assistance. The results are presented in Figure 12 and reveal that there was a marked increase in anomaly scores whenever deviations occurred, regardless of the direction (i.e., regardless of whether the patient’s movement trajectory was above or below the predetermined trajectory). This increase in scores indicates that the above-mentioned anomalies were detected.

Table 7: Comparison of AUCs of the Two Anomaly Detection Methods

Method	VAE	Ours
AUC	0.865	0.999

The performance of our anomaly detector was experimentally assessed using a VAE-based anomaly detector (Zhang et al., 2023) as a baseline. Specifically, detection performance in the stroke tremor scenario was evaluated using a receiver operating characteristic (ROC) curve, which illustrates the ability of a classification model to differentiate between classes. The areas under the ROC curve (AUCs) for the considered models are presented in Table 7, and the calculated anomaly scores are depicted in Figure 13. In the figure, it can be seen that the scores generated by the proposed anomaly detector markedly increased as the anomaly occurred. In contrast, the scores generated by the VAE-based anomaly detector failed to return to a normal level once the anomaly ceased. These results indicate that compared with the VAE-based anomaly detector, our anomaly detector is more adaptive in classifying anomalies in different joint configurations, owing to the superior generative performance of the diffusion model.
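For readers who wish to reproduce an AUC comparison of this kind, the area under the ROC curve can be computed directly from anomaly scores and ground-truth labels. The sketch below uses the standard rank-sum (Mann–Whitney) identity; the function name and the labeling convention (1 for anomalous, 0 for normal) are assumptions for illustration, and the paper does not state how its AUCs were computed.

```python
import numpy as np

def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: anomaly scores; labels: 1 for anomalous samples, 0 for normal ones.
    The AUC equals the probability that a randomly chosen anomalous sample
    scores higher than a randomly chosen normal one (ties count as one half).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every anomalous score with every normal score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

An AUC of 1.0 means that every anomalous sample receives a higher score than every normal sample, whereas 0.5 corresponds to chance-level separation.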

7.3 Interaction Control

During rehabilitation training, the upper-limb exoskeleton robot is required to regulate human–robot interaction according to a desired impedance model with consideration of friction compensation. To this end, the upper-limb exoskeleton robot was operated to move slowly in the absence of a participant, and friction was quantified. That is, the difference between the readings from the potentiometers at the motor output and the measurements from the torque sensor at the joint end was calculated. In addition, joint velocities were recorded. Subsequently, polynomial fitting was applied based on the simplified friction model (40) to obtain the estimated values for the friction parameters presented in Table 8.

Table 8: Estimated Friction Model

	$\bar{a}_{f}$	$\bar{b}_{f}$	$\bar{c}_{f}$
Joint 3	0.822	-2.132	0.557
Joint 4	1.718	11.441	-3.028
Joint 5	0.636	-1.212	1.005
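The friction identification step can be illustrated with a least-squares polynomial fit. The sketch below assumes, purely for illustration, a second-order polynomial in joint velocity; the actual form of the simplified friction model (40) is defined earlier in the paper and may differ, so no correspondence between these coefficients and $\bar{a}_{f}$, $\bar{b}_{f}$, $\bar{c}_{f}$ is implied. All names are hypothetical.

```python
import numpy as np

def fit_friction(joint_velocity, motor_side_torque, joint_side_torque, degree=2):
    """Least-squares polynomial fit of friction torque against joint velocity.

    Friction is taken as the difference between the motor-side and joint-side
    torque readings, as described in the text. The polynomial form and its
    degree are assumptions made for this sketch only.
    """
    friction = np.asarray(motor_side_torque) - np.asarray(joint_side_torque)
    # Coefficients are returned from the highest power to the constant term.
    return np.polyfit(np.asarray(joint_velocity), friction, degree)
```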

Next, we implemented our impedance controller in the upper-limb exoskeleton robot, devoid of a patient, to track a pre-defined sinusoidal trajectory involving all active joints. This experiment aimed to validate the accuracy of the friction compensation and evaluate the dynamic performance of the controller. The impedance parameters were set as follows: $\bm{C}_{d}=10\bm{I}_{5},\bm{K}_{d}=50\bm{I}_{5}$ where $\bm{I}_{5}$ is a $5\times 5$ identity matrix. The parameters of the weighting function were set as follows: $\lambda_{1}=0.5,\chi_{1}=0.04,\chi_{2}=8.75$ and $\lambda_{2}=1.5$ . The control parameters were set as follows: $\bm{K}_{v}=1.1\bm{I}_{3}$ and $\bm{K}_{z}=diag(1.5,0.6,0.7,4,1.8)$ . The experimental results demonstrate that the impedance controller effectively compensated for friction, enabling the upper-limb exoskeleton robot to accurately follow the desired trajectory, as illustrated in Figure 14. Specifically, the RMSEs for the joints during this trajectory tracking task were $0.613^{\circ}$ for Joint 1, $0.728^{\circ}$ for Joint 2, $0.997^{\circ}$ for Joint 3, $2.143^{\circ}$ for Joint 4, and $0.948^{\circ}$ for Joint 5.

Additionally, we assessed the performance of our impedance controller under conditions involving anomalies. This was achieved by having the upper-limb exoskeleton robot guide the patient in trajectory tracking while the patient held the upper-limb exoskeleton robot in positions that simulated anomalies. The results of this experiment are depicted in Figure 15. It can be seen that throughout the experiment, the impedance vector remained close to zero, indicating that the desired variable impedance model was effectively maintained despite human involvement and the occurrence of anomalies.

7.4 Active Mirroring Training

In active mirroring training, the movements of the upper-limb exoskeleton robot were aligned with the motion intentions of the unaffected side of the body of the patient (as illustrated in Figure 8). In addition, online trajectory refinement was used to smoothen movement commands and enhance safety through dynamic constraints. The details of the experiments are given below.

- To evaluate the capability of the proposed method to handle unexpected external impact, we conducted an experiment on Joint 2 of the upper-limb exoskeleton robot’s shoulder.
- We simulated sudden external disturbances by abruptly altering the desired position during the experiments.
- To represent the lower constraint of the online trajectory refinement, we imposed a lower bound on the position command at $20^{\circ}$, thereby simulating a tendency to exceed the acceptable movement range.
- During comparative trials, we eliminated motion command inconsistencies caused by feedback from the unaffected side of the body by deactivating the positional feedback on this side. Therefore, we instead relied on the proprioceptive sensors of the upper-limb exoskeleton robot (specifically, its encoders) and conducted the experiment without a patient.

The position commands and joint positions with and without the implementation of online trajectory refinement, respectively, are illustrated in Figure 16. This figure reveals that without online trajectory refinement, the joint position surpassed the set movement boundary. However, when online trajectory refinement was activated, the joint movement responded effectively to the dynamic constraints and remained close to the boundary. Specifically, without trajectory refinement, the movement exceeded the established boundary by approximately $9^{\circ}$ . In contrast, with trajectory refinement, the excess was significantly reduced to just $0.5^{\circ}$ . The trajectory refinement is designed to adjust the trajectory to adhere to constraints and to mitigate the extent to which joint position violates movement boundaries. Therefore, even with trajectory refinement, the joint position may slightly cross the boundary, as observed with the $0.5^{\circ}$ transgression in this experiment.

Next, we experimented with Joint 1 to verify that the proposed method identified anomaly regions and guided the planned trajectory toward safer areas with lower anomaly scores than the current area. Ideally, this joint’s normal operational range should not significantly exceed zero. In particular, the shoulder adduction angle should remain small during typical activities. Thus, as a motion capture system may yield inaccurate adduction angle estimates due to marker obstruction, we instead utilized the encoder feedback from the upper-limb exoskeleton robot to obtain adduction angle estimates. The results are displayed in Figure 17. In the figure, it can be seen that as the shoulder adduction angle increased, the motion commands generated by the online trajectory refinement decreased the adduction angle of movements rather than ensuring that they strictly adhered to the target trajectory. Moreover, when the joint angle returned to the normal range, i.e., when the shoulder moved into the abduction space, the trajectory refinement resumed its focus on aligning the trajectory with the target trajectory within the dynamic constraints. These results confirm that the proposed method is capable of tracking changes in human motion intention within the normal activity range and refining the trajectory accordingly.

A motion capture system must be installed in a hospital for deploying active mirroring training in clinical trials, and such a system introduces additional limitations and inconvenience. Thus, we did not conduct clinical trials of active mirroring training. Instead, we evaluated our method by using the upper-limb exoskeleton for active mirroring training in a healthy subject by mapping the motion of the left arm (i.e., the healthy side) to that of the right arm (i.e., the mock stroke side). Acquiring precise angles of human upper limbs through an optical motion capture system requires a custom-made suit, which was beyond the scope of this study. Instead, we employed a self-manufactured brace embedded with key markers and gloves fitted with markers to estimate the angles at three limb joints, specifically those corresponding to Joints 1, 2, and 4. However, due to brace deformation and marker obstruction, the accuracy of the upper-limb joint angles was limited to within a certain range. Thus, to ensure safety and demonstrate the capability of our method to impose constraints on upper-limb movements, we set position constraints for Joints 1, 2, and 4 as $[-40^{\circ},10^{\circ}]$ , $[-10^{\circ},80^{\circ}]$ , and $[0^{\circ},60^{\circ}]$ , respectively.

As shown in Figure 18, the desired trajectory was consistently aligned with the trajectory of the human limb $\bm{q}_{h}$ and was refined according to dynamic constraints. Moreover, when the unaffected side of the body moved rapidly to a position outside the established movement boundary, the proposed trajectory refinement adjusted swiftly. That is, a predetermined maximum speed was imposed to prevent the set joint position boundaries from being exceeded. When the movements of the joints on the unaffected side of the body remained within these dynamic constraints, the upper-limb exoskeleton robot tracked the trajectory of the healthy limb. Throughout the training session, our variable impedance controller effectively sustained the human–robot interaction within the desired impedance model.
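The effect of combining a maximum speed with joint position bounds can be illustrated with a simple rate-and-range limiter. This is a deliberately simplified stand-in and not the online trajectory refinement proposed in the paper; the function name, the speed limit `v_max`, and the time step `dt` are hypothetical, while the position bounds are those quoted for Joints 1, 2, and 4.

```python
import numpy as np

def limit_command(q_prev, q_target, q_min, q_max, v_max, dt):
    """Move toward q_target with bounded speed, then clip to the joint range.

    Angles are in degrees, v_max in degrees per second, dt in seconds.
    """
    # Bound the per-step change so the commanded speed never exceeds v_max.
    step = np.clip(q_target - q_prev, -v_max * dt, v_max * dt)
    # Keep the resulting command inside the position constraints.
    return np.clip(q_prev + step, q_min, q_max)

# Position constraints quoted in the text for Joints 1, 2, and 4 (degrees).
Q_MIN = np.array([-40.0, -10.0, 0.0])
Q_MAX = np.array([10.0, 80.0, 60.0])
```

With such a limiter, a target that jumps outside the allowed range produces a command that approaches the boundary at bounded speed and then stays on it, which is qualitatively the behavior described for Figure 18.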

7.5 Passive Following Training

We conducted a series of experiments to validate the efficiency of the proposed individualization framework on the passive following training task. These experiments were centered on a typical task in upper-limb rehabilitation: raising the upper limb to a fixed point. This task requires coordination between the shoulder and elbow, specifically involving Joints 1, 2, and 4. During the cost calculation, the anomaly score was scaled to have the same magnitude as the tracking error, and the hyperparameter $\lambda_{p}$ was set to $0.003$ . Moreover, as the exploration part of assistance individualization largely consists of initial trajectory sampling based on a demonstration, we set $N_{s}=40$ . The training process was designed to stop when the RMSE of the mean trajectory of the distribution between two consecutive iterations decreased to less than $0.1^{\circ}$ or when the maximum number of iterations was reached. Furthermore, the sampling space was greatly reduced by the demonstration data. The number of samples set was considered to be sufficient for exploration in the current task based on practical experience.
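The stopping rule described above can be written compactly as follows. This is a minimal sketch: the function names are illustrative, the mean trajectories are assumed to be arrays of joint angles in degrees with shape (steps, joints), and `max_iterations` is a hypothetical parameter because its value is not stated in the text.

```python
import numpy as np

def mean_trajectory_rmse(mean_prev, mean_curr):
    """RMSE (degrees) between the mean trajectories of consecutive iterations."""
    diff = np.asarray(mean_curr) - np.asarray(mean_prev)
    return float(np.sqrt(np.mean(diff ** 2)))

def should_stop(mean_prev, mean_curr, iteration, max_iterations, tol_deg=0.1):
    """Stop when the mean trajectory has settled or the budget is exhausted."""
    settled = mean_trajectory_rmse(mean_prev, mean_curr) < tol_deg
    return settled or iteration >= max_iterations
```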

First, we conducted ablation studies with a healthy participant to validate the efficacy of the online refinement module in passive following training scenarios. In addition, five restarts were executed to bypass local optima. The results of these studies are depicted in Figure 19. Within the figure, the black dashed line illustrates the trajectory that achieved the lowest cost during the exploration phase, while the red solid line represents the mean of the assistive trajectory distribution. Without online trajectory refinement, the experiment spanned 67 iterations, whereas with online trajectory refinement, the experiment spanned 55 iterations. The experimental results reveal that without online refinement in passive following training individualization, the converged assistive trajectory distribution exhibited a large variance, and the trajectory with the lowest cost significantly deviated from the mean of the distribution. This indicates that the trajectory distribution with similar costs was rather wide. In contrast, with online refinement in passive following training individualization, the converged assistive trajectory distribution had a smaller variance, suggesting that there was a more concentrated set of trajectories with similar costs. This concentration is largely attributable to the use of sensor feedback and anomaly scores in the online refinement, which facilitated real-time adjustments to the assistive trajectory. This process effectively reduced the uncertainty of motion intentions and minimized conflicts in human–robot interaction. Thus, the cost function evaluated and distinguished the performance of different trajectories more precisely, which enhanced the overall effectiveness of the training.

The effectiveness of the individualized assistance was assessed using four metrics: the tracking RMSE, the anomaly score, the cost (35), and the EMG signal level of the biceps brachii. The results of this evaluation are displayed in Figure 20. To allow comparison, all metrics were averaged over one cycle and normalized relative to the original assistance. Compared with the other assistance (i.e., the mean trajectory of the demonstrations, and the optimized assistance without online refinement), the assistance incorporating online refinement demonstrated the best performance in improving tracking accuracy and reducing the anomaly score during motion. Furthermore, implementing online refinement resulted in the lowest cost following passive training and a reduction in the EMG signal level. These outcomes substantiate the efficacy of the proposed individualization framework for passive following training.

Table 9: Attributes of Participants

Group	Subject	Gender	Age(y)	Weight(kg)	Height(cm)	Arm length(cm)	Diagnosis
Control	1	Male	46	70	176	61	Cerebral hemorrhage
	2	Male	60	65	167	57	Stroke
	3	Male	62	75	170	58	Stroke
Experimental	4	Male	44	67	169	59	Stroke
	5	Male	51	77	177	64	Stroke
	6	Male	66	66	167	57	Stroke

We also recruited seven patients for clinical trials to demonstrate the effectiveness of the proposed method. As one participant was transferred to another hospital, only six participants completed the trial. All the patients had a healthy side that did not move naturally, due to the effects of their stroke. Hence, passive following training was employed. Each participant signed an informed consent form, and all experiments were approved by the ethics committee created by Shenzhen MileBot Robotics Co., Ltd in May 2023. The participants were allocated to either a control group or an experimental group and their details are summarized in Table 9. Both groups engaged in regular daily rehabilitation exercises, and the experimental group participated in an additional 14 days of passive following training.

The rehabilitation task was set to be the same as the previously described task of raising the arm, involving the coordinated movement of Joints 1, 2, and 4, as illustrated in Figure 21. Interactive information was recorded during the limb-lifting phase, which was considered the training task, while the limb-lowering phase, which began at $t=3.5\,\mathrm{s}$, was used to return to the initial position. The motor abilities of both groups were evaluated and scored by professional healthcare personnel using specific evaluation metrics. These metrics were the muscle tone level and the Fugl-Meyer assessment (FMA) score (Fugl-Meyer et al., 1975). A low level of muscle tone and a high FMA score are indicative of good upper-limb motor ability. Both groups were subjected to a motor function assessment before treatment. During the treatment phase, the experimental group participated in daily passive following training sessions, each of which lasted approximately 15 minutes. Both groups underwent reassessment 2 weeks after the start of the treatment. The experimental results normalized to the initial evaluation are displayed in Figure 21, and detailed evaluation results are included in the Appendix. Unlike the control group, the experimental group exhibited significant improvements in all metrics compared with their initial assessments before the treatment. Thus, compared with the results of the control group, the results of the experimental group indicate that the passive following training with the upper-limb exoskeleton robot accelerated the recovery of motor functions. Therefore, this training could enhance the effectiveness of treatment for conditions such as stroke and cerebral hemorrhage.

8 Conclusion and Discussion

8.1 Conclusion

Overall: This paper introduces a dual-mode individualization framework that incorporates generative models. This framework incorporates an intention predictor and an anomaly detector, which are used to capture the motion intentions of the unaffected side of the patient and to assess the human–robot interaction in real time during rehabilitation tasks. In active mirroring mode, the assistance reflects the patient’s original motion intentions. In passive following mode, the assistance is tailored to the patient based on interactive feedback. Trajectories in both modes are integrated within an online trajectory refinement framework, ensuring that they are smooth, adhere to dynamic constraints, and are individualized, thereby effectively supporting the patient’s rehabilitation.

Details: The online trajectory refinement integrates both training configurations and utilizes generative models to achieve personalized assistance. Specifically, in active mirroring mode, the reference trajectory is derived from the unaffected limb, with the intention predictor providing a predicted trajectory distribution that is preemptively tuned to mitigate potential risk movements. Conversely, in passive following mode, the reference trajectory is pre-defined based on human demonstrations. Additionally, the anomaly detector plays a crucial role in guiding the online refinement process to enhance the naturalness of movements in real time. This detector assesses the deviation between the current interaction data and standard demonstration data obtained from healthy individuals, thereby facilitating performance evaluation during passive following training. In passive following mode, ProMPs are implemented for specific training tasks, with each movement of the patient weighted according to a cost function. This approach significantly enhances the effectiveness of the generated assistance distribution.

Performance: We conducted a series of experiments, including a clinical trial, to validate each of the proposed modules and demonstrate their effectiveness in enhancing assistance and ensuring safety. In terms of prediction accuracy, the intention predictor outperformed alternative methods, namely forward integration and a CNN-LSTM. Furthermore, the anomaly detector accurately identified anomalies across different scenarios. Moreover, the performance of the variable impedance controller was validated in trajectory tracking and in assisting the patient when anomalies occurred. During active mirroring training, online refinement effectively reduced the degree of constraint violations in the presence of unexpected impacts. It was also capable of identifying abnormal regions within the movement space and guiding the upper-limb exoskeleton robot toward regions with lower anomaly scores. This active mirroring training approach was tested with a motion capture system, which validated its effectiveness. In passive following mode, testing was conducted with healthy individuals and in a clinical trial. The results confirm that the approach provided personalized assistance to healthy participants and significantly accelerated the recovery of motor functions in stroke participants. Specifically, the clinical trial data indicate that the experimental group, which had participated in passive following training, showed improvements in various performance metrics after completing the treatment protocol.

8.2 Discussion

Limitations: The current trajectory generation method exhibits two main limitations, as detailed below.

1) The performance of the intention predictor and anomaly detector depends on the size and quality of the dataset. Expanding the dataset to include more subjects would significantly improve the performance of the generative models. This would increase the accuracy of predictions of patient motion intentions and the precision of detection of abnormal interactions during movements, thereby enhancing the personalization of the training modes.

2) The clinical trial included only six participants and focused exclusively on the rehabilitation effects of the passive following mode. Conducting a clinical trial with more participants and incorporating active mirroring mode into the rehabilitation process would provide a more comprehensive evaluation of the proposed dual-mode individualization framework.

Efforts to address these limitations in the manner described will form the basis of our future research and development activities.

Intellectual Merits: We have developed an innovative dual-mode individualization framework that incorporates generative models, thereby establishing a new benchmark for adaptive rehabilitation systems. This novel framework can switch between active mirroring and passive following modes based on the patient’s needs and thus offers personalized assistance and enhanced rehabilitation outcomes. Key features of this framework are its real-time intention prediction and anomaly detection capabilities. Specifically, the intention predictor captures motion intentions from the unaffected side of the patient, while the anomaly detector evaluates human–robot interactions in real time, ensuring immediate adaptation and response to the patient’s movements. Additionally, the framework integrates online trajectory refinement that unifies trajectories from both active mirroring and passive following modes to ensure they are smooth, dynamically constrained, and individualized. Thus, the framework provides more natural and effective assistance than other frameworks. The application of generative models to personalize assistance based on interactive feedback ensures that the rehabilitation process is effectively responsive to individual patient conditions and needs.

Potential Impacts: The development of a dual-mode individualization framework that integrates generative models represents a significant advancement that could enhance the deployment and effectiveness of rehabilitation technologies in both clinical and homecare environments. Specifically, as this framework delivers personalized and adaptive assistance tailored to real-time feedback and the specific motion intentions of the patient, it has the potential to substantially change the rehabilitation process. To the best of the authors’ knowledge, this study is the first to integrate generative models into an upper-limb exoskeleton robot and perform a clinical trial. Our approach not only enhances the functionality of rehabilitation devices but could also bridge the gap between artificial intelligence (AI) and rehabilitation medicine, thereby facilitating the translation of advancements in AI into practical medical applications. This study also exemplifies the value of interdisciplinary research, as it combined principles from the fields of robotics, control systems, machine learning, and clinical rehabilitation, and it sets a precedent for future studies aiming to develop comprehensive and adaptive rehabilitation systems. Furthermore, the framework devised in this study addresses the broader societal challenge posed by an aging population. Specifically, the framework offers methods that could be used in advanced rehabilitation solutions that are applicable in both healthcare facilities and home settings.

9 Funding

This work was supported in part by the Science and Technology Innovation 2030-Key Project under Grant 2021ZD0201404, in part by the Institute for Guo Qiang, Tsinghua University, and in part by the National Natural Science Foundation of China under Grants U21A20517 and 52075290.

References

Albu-Schäffer et al. (2007) Albu-Schäffer A, Ott C and Hirzinger G (2007) A unified passivity-based control framework for position, torque and impedance control of flexible joint robots. The International Journal of Robotics Research 26(1): 23–39.
Chen et al. (2019) Chen T, Casas R and Lum PS (2019) An elbow exoskeleton for upper limb rehabilitation with series elastic actuator and cable-driven differential. IEEE Transactions on Robotics 35(6): 1464–1474.
Chen et al. (2023a) Chen Y, Chen G, Ye J, Fu C, Liang B and Li X (2023a) Learning to assist different wearers in multitasks: Efficient and individualized human-in-the-loop adaption framework for exoskeleton robots. arXiv preprint arXiv:2309.14720 .
Chen et al. (2023b) Chen Y, Chen G, Ye J, Qiu X and Li X (2023b) Safe and individualized motion planning for upper-limb exoskeleton robots using human demonstration and interactive learning. arXiv preprint arXiv:2309.08178 .
Clark and Amor (2022) Clark G and Amor HB (2022) Learning ergonomic control in human–robot symbiotic walking. IEEE Transactions on Robotics 39(1): 327–342.
De Wit et al. (1995) De Wit CC, Olsson H, Astrom KJ and Lischinsky P (1995) A new model for control of systems with friction. IEEE Transactions on automatic control 40(3): 419–425.
Ebrahimi et al. (2017) Ebrahimi A, Gröninger D, Singer R and Schneider U (2017) Control parameter optimization of the actively powered upper body exoskeleton using subjective feedbacks. In: 2017 3rd international conference on control, automation and robotics (ICCAR). IEEE, pp. 432–437.
Fugl-Meyer et al. (1975) Fugl-Meyer AR, Jääskö L, Leyman I, Olsson S and Steglind S (1975) A method for evaluation of physical performance. Scand J Rehabil Med 7(1): 13–31.
Gull et al. (2020) Gull MA, Bai S and Bak T (2020) A review on design of upper limb exoskeletons. Robotics 9(1): 16.
Han et al. (2023) Han S, Wang H and Yu H (2023) Human–robot interaction evaluation-based aan control for upper limb rehabilitation robots driven by series elastic actuators. IEEE Transactions on Robotics .
Ho et al. (2020) Ho J, Jain A and Abbeel P (2020) Denoising diffusion probabilistic models. Advances in neural information processing systems 33: 6840–6851.
Huang et al. (2015) Huang J, Huo W, Xu W, Mohammed S and Amirat Y (2015) Control of upper-limb power-assist exoskeleton using a human-robot interface based on motion intention recognition. IEEE transactions on automation science and engineering 12(4): 1257–1270.
Huang and Krakauer (2009) Huang VS and Krakauer JW (2009) Robotic neurorehabilitation: a computational motor learning perspective. Journal of neuroengineering and rehabilitation 6: 1–13.
Jau (1988) Jau BM (1988) Anthropomorphic exoskeleton dual arm/hand telerobot controller. In: IEEE International Workshop on Intelligent Robots. IEEE, pp. 715–718.
Jezernik et al. (2004) Jezernik S, Colombo G and Morari M (2004) Automatic gait-pattern adaptation algorithms for rehabilitation with a 4-dof robotic orthosis. IEEE Transactions on Robotics and Automation 20(3): 574–582.
Kagawa et al. (2015) Kagawa T, Ishikawa H, Kato T, Sung C and Uno Y (2015) Optimization-based motion planning in joint space for walking assistance with wearable robot. IEEE Transactions on Robotics 31(2): 415–424.
Kalita et al. (2021) Kalita B, Narayan J and Dwivedy SK (2021) Development of active lower limb robotic-based orthosis and exoskeleton devices: a systematic review. International Journal of Social Robotics 13: 775–793.
Kim and Deshpande (2017) Kim B and Deshpande AD (2017) An upper-body rehabilitation exoskeleton harmony with an anatomical shoulder mechanism: Design, modeling, control, and performance evaluation. The International Journal of Robotics Research 36(4): 414–435.
Kim et al. (2012) Kim H, Miller LM, Byl N, Abrams GM and Rosen J (2012) Redundancy resolution of the human arm and an upper limb exoskeleton. IEEE transactions on biomedical engineering 59(6): 1770–1779.
Kim and Cho (2019) Kim TY and Cho SB (2019) Predicting residential energy consumption using cnn-lstm neural networks. Energy 182: 72–81.
Lanotte et al. (2021) Lanotte F, McKinney Z, Grazi L, Chen B, Crea S and Vitiello N (2021) Adaptive control method for dynamic synchronization of wearable robotic assistance to discrete movements: Validation for use case of lifting tasks. IEEE Transactions on Robotics 37(6): 2193–2209.
Lee et al. (2020) Lee SH, Park G, Cho DY, Kim HY, Lee JY, Kim S, Park SB and Shin JH (2020) Comparisons between end-effector and exoskeleton rehabilitation robots regarding upper extremity function among chronic stroke patients with moderate-to-severe upper limb impairment. Scientific reports 10(1): 1806.
Lenzi et al. (2012) Lenzi T, De Rossi SMM, Vitiello N and Carrozza MC (2012) Intention-based emg control for powered exoskeletons. IEEE transactions on biomedical engineering 59(8): 2180–2190.
Li et al. (2023a) Li H, Guo S, Bu D, Wang H and Kawanishi M (2023a) Subject-independent estimation of continuous movements using cnn-lstm for a home-based upper limb rehabilitation system. IEEE Robotics and Automation Letters .
Li et al. (2022) Li N, Yang Y, Li G, Yang T, Wang Y, Chen W, Yu P, Xue X, Zhang C, Wang W et al. (2022) Multi-sensor fusion-based mirror adaptive assist-as-needed control strategy of a soft exoskeleton for upper limb rehabilitation. IEEE Transactions on Automation Science and Engineering .
Li et al. (2018a) Li X, Liu Y and Yu H (2018a) Iterative learning impedance control for rehabilitation robots driven by series elastic actuators. Autom. 90: 1–7. URL https://api.semanticscholar.org/CorpusID:30545766.
Li et al. (2018b) Li X, Liu YH and Yu H (2018b) Iterative learning impedance control for rehabilitation robots driven by series elastic actuators. Automatica 90: 1–7.
Li et al. (2017) Li X, Pan Y, Chen G and Yu H (2017) Multi-modal control scheme for rehabilitation robotic exoskeletons. The International Journal of Robotics Research 36(5-7): 759–777.
Li et al. (2021) Li X, Zhang X, Li X, Long J, Li J, Xu L, Chen G and Ye J (2021) Bear-h: An intelligent bilateral exoskeletal assistive robot for smart rehabilitation. IEEE Robotics & Automation Magazine 29(3): 34–46.
Li et al. (2023b) Li Z, Li Q, Huang P, Xia H and Li G (2023b) Human-in-the-loop adaptive control of a soft exo-suit with actuator dynamics and ankle impedance adaptation. IEEE Transactions on Cybernetics .
Long et al. (2018) Long Y, Du Zj, Wang Wd and Dong W (2018) Human motion intent learning based motion assistance control for a wearable exoskeleton. Robotics and Computer-Integrated Manufacturing 49: 317–327.
Mao and Agrawal (2012) Mao Y and Agrawal SK (2012) Design of a cable-driven arm exoskeleton (carex) for neural rehabilitation. IEEE transactions on robotics 28(4): 922–931.
Martinez et al. (2018) Martinez A, Lawson B, Durrough C and Goldfarb M (2018) A velocity-field-based controller for assisting leg movement during walking with a bilateral hip and knee lower limb exoskeleton. IEEE Transactions on Robotics 35(2): 307–316.
Martinez et al. (2013) Martinez JA, Ng P, Lu S, Campagna MS and Celik O (2013) Design of wrist gimbal: A forearm and wrist exoskeleton for stroke rehabilitation. In: 2013 IEEE 13th international conference on rehabilitation robotics (ICORR). IEEE, pp. 1–6.
Nasr et al. (2023) Nasr A, Bell S and McPhee J (2023) Optimal design of active-passive shoulder exoskeletons: A computational modeling of human-robot interaction. Multibody System Dynamics 57(1): 73–106.
Pan et al. (2022) Pan J, Astarita D, Baldoni A, Dell’Agnello F, Crea S, Vitiello N and Trigili E (2022) Nesm- $\gamma$ : An upper-limb exoskeleton with compliant actuators for clinical deployment. IEEE Robotics and Automation Letters 7(3): 7708–7715.
Pan et al. (2017) Pan Y, Wang H, Li X and Yu H (2017) Adaptive command-filtered backstepping control of robot arms with compliant actuators. IEEE transactions on control systems technology 26(3): 1149–1156.
Paraschos et al. (2013) Paraschos A, Daniel C, Peters JR and Neumann G (2013) Probabilistic movement primitives. Advances in neural information processing systems 26.
Perry et al. (2007) Perry JC, Rosen J and Burns S (2007) Upper-limb powered exoskeleton design. IEEE/ASME transactions on mechatronics 12(4): 408–417.
Pratt and Williamson (1995) Pratt GA and Williamson MM (1995) Series elastic actuators. In: Proceedings 1995 IEEE/RSJ international conference on intelligent robots and systems. Human robot interaction and cooperative robots, volume 1. IEEE, pp. 399–406.
Qiu et al. (2020) Qiu S, Guo W, Caldwell D and Chen F (2020) Exoskeleton online learning and estimation of human walking intention based on dynamical movement primitives. IEEE Transactions on Cognitive and Developmental Systems 13(1): 67–79.
Quigley et al. (2009) Quigley M, Conley K, Gerkey B, Faust J, Foote T, Leibs J, Wheeler R, Ng AY et al. (2009) Ros: an open-source robot operating system. In: ICRA workshop on open source software, volume 3. Kobe, Japan, p. 5.
Said et al. (2022) Said RR, Heyat MBB, Song K, Tian C and Wu Z (2022) A systematic review of virtual reality and robot therapy as recent rehabilitation technologies using eeg-brain–computer interface based on movement-related cortical potentials. Biosensors 12(12): 1134.
Salzmann et al. (2020) Salzmann T, Ivanovic B, Chakravarty P and Pavone M (2020) Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVIII 16. Springer, pp. 683–700.
Sambhus et al. (2023) Sambhus R, Gokce A, Welch S, Herron CW and Leonessa A (2023) Real-time model-free deep reinforcement learning for force control of a series elastic actuator. In: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 5645–5652.
Sommerhalder et al. (2023) Sommerhalder M, Zimmermann Y, Simovic L, Hutter M, Wolf P and Riener R (2023) Trajectory optimization framework for rehabilitation robots with multi-workspace objectives and constraints. IEEE Robotics and Automation Letters .
Song et al. (2020) Song J, Meng C and Ermon S (2020) Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502 .
Spong (1987) Spong MW (1987) Modeling and control of elastic joint robots .
Tiboni et al. (2022) Tiboni M, Borboni A, Vérité F, Bregoli C and Amici C (2022) Sensors and actuation technologies in exoskeletons: A review. Sensors 22(3): 884.
Tikhonov (1952) Tikhonov AN (1952) Systems of differential equations containing small parameters in the derivatives. Matematicheskii sbornik 73(3): 575–586.
Wang et al. (2022) Wang Y, Zahedi A, Zhao Y and Zhang D (2022) Extracting human-exoskeleton interaction torque for cable-driven upper-limb exoskeleton equipped with torque sensors. IEEE/ASME Transactions on mechatronics 27(6): 4269–4280.
Williams et al. (2017) Williams G, Wagener N, Goldfain B, Drews P, Rehg JM, Boots B and Theodorou EA (2017) Information theoretic mpc for model-based reinforcement learning. In: 2017 IEEE international conference on robotics and automation (ICRA). IEEE, pp. 1714–1721.
Wu et al. (2016) Wu Q, Wang X and Du F (2016) Development and analysis of a gravity-balanced exoskeleton for active rehabilitation training of upper limb. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science 230(20): 3777–3790.
Xu et al. (2020) Xu J, Xu L, Li Y, Cheng G, Shi J, Liu J and Chen S (2020) A multi-channel reinforcement learning framework for robotic mirror therapy. IEEE Robotics and Automation Letters 5(4): 5385–5392.
Xu et al. (2023) Xu M, Zhou Z, Wang Z, Ruan L, Mai J and Wang Q (2023) Bio-inspired cable-driven actuation system for wearable robotic devices: Design, control and characterization. IEEE Transactions on Robotics .
Zhang et al. (2017) Zhang J, Fiers P, Witte KA, Jackson RW, Poggensee KL, Atkeson CG and Collins SH (2017) Human-in-the-loop optimization of exoskeleton assistance during walking. Science 356(6344): 1280–1284.
Zhang et al. (2022) Zhang Q, Nalam V, Tu X, Li M, Si J, Lewek MD and Huang HH (2022) Imposing healthy hip motion pattern and range by exoskeleton control for individualized assistance. IEEE Robotics and Automation Letters 7(4): 11126–11133.
Zhang et al. (2023) Zhang X, Shu Y, Chen Y, Chen G, Ye J, Li X and Li X (2023) Multi-modal learning and relaxation of physical conflict for an exoskeleton robot with proprioceptive perception. In: 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, pp. 10490–10496.
Zhu et al. (2021) Zhu H, Nesler C, Divekar N, Peddinti V and Gregg RD (2021) Design principles for compact, backdrivable actuation in partial-assist powered knee orthoses. IEEE/ASME Transactions On Mechatronics 26(6): 3104–3115.
Zhu et al. (2020) Zhu L, Wang Z, Ning Z, Zhang Y, Liu Y, Cao W, Wu X and Chen C (2020) A novel motion intention recognition approach for soft exoskeleton via imu. Electronics 9(12): 2176.
Zimmermann et al. (2020) Zimmermann Y, Küçüktabak EB, Farshidian F, Riener R and Hutter M (2020) Towards dynamic transparency: Robust interaction force tracking using multi-sensory control on an arm exoskeleton. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 7417–7424.
Zimmermann et al. (2023a) Zimmermann Y, Sommerhalder M, Wolf P, Riener R and Hutter M (2023a) Anyexo 2.0: A fully actuated upper-limb exoskeleton for manipulation and joint-oriented training in all stages of rehabilitation. IEEE Transactions on Robotics .
Zimmermann et al. (2023b) Zimmermann Y, Song J, Deguelle C, Läderach J, Zhou L, Hutter M, Riener R and Wolf P (2023b) Human–robot attachment system for exoskeletons: Design and performance analysis. IEEE Transactions on Robotics .

Table 10: Detailed Clinical Evaluation

Group	Subject	Initial muscle tone level	Post-trial muscle tone level	Initial FMA	Post-trial FMA
Control	1	1.5	1.5	8	12
	2	1.5	1.5	31	35
	3	2	1.5	23	30
Experimental	4	1.5	1	8	16
	5	2	1.5	19	35
	6	2	2	3	5

10 Appendix

10.1 Stability Analysis

To prove the stability of the whole system, we substitute equations (45) and (47) into (9), resulting in the following system dynamics:

$$(\bm{M}(\bm{q})+\bar{\bm{B}})\dot{\bm{z}}+\bm{C}(\dot{\bm{q}},\bm{q})\bm{z}=-\bm{K}_{z}\bm{z}-\bm{S}_{2}^{\mathsf{T}}\tilde{\bm{\tau}}_{f}-k_{g}\bm{sgn}(\bm{z}), \tag{54}$$

where $\tilde{\bm{\tau}}_{f}=\hat{\bm{\tau}}_{f}-{\bm{\tau}}_{f}$ represents the estimation error of the friction.

Subsequently, we propose the following candidate Lyapunov function:

$$V=\frac{1}{2}\bm{z}^{\mathsf{T}}(\bm{M}(\bm{q})+\bar{\bm{B}})\bm{z}. \tag{55}$$

By differentiating (55) with respect to time and substituting the dynamics with those from (54), we derive the following expression:

$$\dot{V}=-\bm{z}^{\mathsf{T}}\bm{K}_{z}\bm{z}-\bm{z}^{\mathsf{T}}\bm{S}_{2}^{\mathsf{T}}\tilde{\bm{\tau}}_{f}-k_{g}\bm{z}^{\mathsf{T}}\bm{sgn}(\bm{z}). \tag{57}$$

Assuming that $\|\bm{S}_{2}^{\mathsf{T}}\tilde{\bm{\tau}}_{f}\|\leq\kappa$, the upper bound for $\dot{V}$ is derived as follows:

$$\dot{V}\leq-\bm{z}^{\mathsf{T}}\bm{K}_{z}\bm{z}-(k_{g}-\kappa)\|\bm{z}\|. \tag{58}$$

If $k_{g}$ is adequately large such that $k_{g}>\kappa$, the inequality simplifies to

$$\dot{V}\leq-\bm{z}^{\mathsf{T}}\bm{K}_{z}\bm{z}<0. \tag{59}$$

Given that $V>0$ and $\dot{V}<0$ for $\bm{z}\neq\bm{0}$, the quasi-steady-state system is exponentially stable. Considering that the boundary-layer system can be made intrinsically stable by appropriate tuning of $\bm{K}_{1}$ and $\bm{K}_{2}$, the stability of the closed-loop system is assured according to Tikhonov (1952), ensuring convergence to the desired impedance vector.
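The steps between (57), (58), and the exponential-stability claim can be spelled out as follows. This is only a sketch under two assumptions that are not stated explicitly in the text: $\bm{K}_{z}$ is symmetric positive definite, and the effective inertia is uniformly bounded, $m_{1}\bm{I}\preceq\bm{M}(\bm{q})+\bar{\bm{B}}\preceq m_{2}\bm{I}$ with constants $0<m_{1}\leq m_{2}$ (the symbols $m_{1}$, $m_{2}$, and $c$ are introduced here for illustration only).

```latex
% Assumptions (not in the original text): K_z symmetric positive definite,
% m_1 I <= M(q) + \bar{B} <= m_2 I, and k_g > \kappa as required for (59).
\begin{align*}
\bm{z}^{\mathsf{T}}\bm{sgn}(\bm{z}) &= \sum_{i}|z_{i}| = \|\bm{z}\|_{1} \geq \|\bm{z}\|, \\
-\bm{z}^{\mathsf{T}}\bm{S}_{2}^{\mathsf{T}}\tilde{\bm{\tau}}_{f}
  &\leq \|\bm{z}\|\,\|\bm{S}_{2}^{\mathsf{T}}\tilde{\bm{\tau}}_{f}\| \leq \kappa\|\bm{z}\|, \\
\dot{V} &\leq -\lambda_{\min}(\bm{K}_{z})\|\bm{z}\|^{2}
  \leq -\frac{2\lambda_{\min}(\bm{K}_{z})}{m_{2}}\,V =: -cV, \\
V(t) &\leq V(0)\,e^{-ct}, \\
\|\bm{z}(t)\| &\leq \sqrt{m_{2}/m_{1}}\;\|\bm{z}(0)\|\,e^{-ct/2}.
\end{align*}
```

The first two lines give (58) from (57); the third uses $V\leq\tfrac{1}{2}m_{2}\|\bm{z}\|^{2}$, and the last uses $V\geq\tfrac{1}{2}m_{1}\|\bm{z}\|^{2}$.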

10.2 Weight Setting in Passive Following Mode

For a specific training task involving human-robot interaction, we assume that the patient intends to synchronize movement with the upper-limb exoskeleton robot. However, physical factors such as the randomness of human motion intentions may impact the tracking performance of the upper-limb exoskeleton robot. Given the challenge of explicitly accounting for such disturbances within the dynamics, we model this interference as additional noise in trajectory planning. The dynamics are described as follows:

$$\bm{x}_{p}^{(t+1)}=g_{p}(\bm{x}_{p}^{(t)},\bm{q}_{r}), \tag{60}$$

$$\bm{q}_{r}\sim\mathcal{N}(\bm{q}_{rp},\bm{\Sigma}_{p}), \tag{61}$$

where $\bm{x}_{p}=[\bm{x}_{d}^{\mathsf{T}},s]^{\mathsf{T}}$ is an augmented state vector and $g_{p}(\cdot)$ is a nonlinear time-variant function that integrates online refinement, the impedance controller, and deterministic components of human motion intention with human-robot interaction. $\bm{q}_{rp}$ is the mean of the assistance distribution, and $\bm{\Sigma}_{p}$ encapsulates the overall stochastic disturbance, which includes the randomness of human motion intentions and the sampling variability of generative models.

According to Williams et al. (2017), to minimize the cost

$$\hat{\mathcal{S}}(\bm{q}_{r})=\int_{T_{r}}\{\|\bm{q}_{d}(\bm{q}_{r})-\bm{q}\|_{\bm{Q}}^{2}+s^{2}+\frac{\lambda_{p}}{2}\bm{q}_{rp}^{\mathsf{T}}\Sigma_{p}^{-1}\bm{q}_{rp}\}dt, \tag{62}$$

the optimal control input is structured in a cost-decoupled manner as follows:

$$\hat{w}(\bm{q}_{r})=\frac{1}{\hat{\eta}_{k}}\exp(-\frac{1}{\lambda_{p}}(\hat{\mathcal{S}}(\bm{q}_{r})+\gamma_{p}\sum\tilde{\bm{q}}_{rp}^{\mathsf{T}}\Sigma_{p}^{-1}\bm{q}_{r})), \tag{63}$$

$$\hat{\bm{q}}_{rp}^{*}=\mathbb{E}[\hat{w}(\bm{q}_{r})\bm{q}_{r}]. \tag{64}$$

Here, $\gamma_{p}=\lambda_{p}(1-\alpha_{p})$ is the decoupled temperature parameter, defined with $\alpha_{p}\in[0,1]$, and $\tilde{\bm{q}}_{rp}$ is the mean of the initial trajectory estimates.

In this approach, the optimal assistance is derived from the expected value of the current trajectory distribution. Each sampling iteration is based on the previously improved trajectory distribution, which may lead to inconsistencies in the sampling space. To address these potential inconsistencies and ensure adequate exploration, we opt for a large sample size $N_{s}$ in Algorithm 3. Additionally, we set $\alpha_{p}=1$ to mitigate the influence of inconsistencies in $\tilde{\bm{q}}_{rp}$ across different samples and thereby enhance the robustness and reliability of the optimal control solution.
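With $\alpha_{p}=1$, $\gamma_{p}=0$ and the weight in (63) reduces to a normalized $\exp(-\hat{\mathcal{S}}/\lambda_{p})$. The following minimal sketch shows how such weights and the expectation in (64) might be approximated from $N_{s}$ samples. The function names, the toy quadratic cost, and all array shapes are assumptions made for illustration; this is not the authors' Algorithm 3.

```python
import numpy as np

def assistance_weights(costs, lam):
    """Normalized weights w_k proportional to exp(-S_k / lam), i.e. (63) with gamma_p = 0."""
    costs = np.asarray(costs, dtype=float)
    # Subtracting the minimum cost leaves the normalized weights unchanged
    # but avoids numerical underflow when lam is small.
    w = np.exp(-(costs - costs.min()) / lam)
    return w / w.sum()

def refined_mean(samples, costs, lam):
    """Weighted sample mean, a Monte Carlo stand-in for the expectation in (64)."""
    w = assistance_weights(costs, lam)
    return np.tensordot(w, samples, axes=1)  # (N_s,) x (N_s, T, D) -> (T, D)

# Hypothetical toy problem: N_s reference trajectories sampled around a zero mean.
rng = np.random.default_rng(0)
n_samples, horizon, n_joints = 256, 20, 5
target = np.linspace(0.0, 1.0, horizon)[:, None] * np.ones(n_joints)
samples = rng.normal(scale=0.5, size=(n_samples, horizon, n_joints))
costs = ((samples - target) ** 2).sum(axis=(1, 2))  # placeholder for the cost S(q_r)

weights = assistance_weights(costs, lam=1.0)
q_rp_new = refined_mean(samples, costs, lam=1.0)  # mean for the next sampling iteration
```

In an iterative scheme, `q_rp_new` would replace the mean of the sampling distribution before the next batch is drawn, which is why a large sample size helps keep successive batches consistent.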

Given that the last term in (62) is undesirable in the rehabilitation process, the ideal setting is $\lambda_{p}=0$, leading to the following definition of optimal assistance:

$$\hat{\bm{q}}_{rp}^{*}=\mathop{\arg\min}\limits_{\bm{q}_{r}}\hat{\mathcal{S}}(\bm{q}_{r}). \tag{65}$$

In this scenario, the optimal assistance is identified as the sampled trajectory with the lowest cost, and all other sampled outcomes are disregarded. However, this formulation disregards the distribution of the optimal assistance, rendering the iterative improvement mechanism nonviable. Thus, to maintain the feasibility of iterative improvements and also minimize the impact of the undesirable term, the parameter $\lambda_{p}$, which governs the tightness of the solution, is set to a small value. This adjustment ensures that $\hat{\mathcal{S}}(\bm{q}_{r})\rightarrow{\mathcal{S}}(\bm{q}_{r})$, and the optimal weight formula in (63) simplifies to the configuration used in (36). This setting balances the need to minimize the undesired term with the need to maintain a practical and effective iterative improvement process.
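The trade-off governed by $\lambda_{p}$ can be illustrated numerically. The sketch below uses the effective sample size $1/\sum_{k}w_{k}^{2}$ of the normalized weights as a rough indicator of how many samples still contribute. The cost values and the choice of this diagnostic are assumptions for illustration and do not come from the paper.

```python
import numpy as np

def softmin_weights(costs, lam):
    """Normalized weights proportional to exp(-cost / lam)."""
    costs = np.asarray(costs, dtype=float)
    w = np.exp(-(costs - costs.min()) / lam)  # shift by the minimum for stability
    return w / w.sum()

def effective_sample_size(w):
    """1 / sum(w^2): equals 1 for one-hot weights and N for uniform weights."""
    return 1.0 / np.sum(np.square(w))

costs = np.array([1.0, 1.2, 2.0, 3.5])  # hypothetical trajectory costs

# Very small lam: weights collapse onto the lowest-cost sample, as in (65).
# Very large lam: weights become nearly uniform and the cost is ignored.
ess_by_lambda = {
    lam: effective_sample_size(softmin_weights(costs, lam))
    for lam in (1e-3, 0.5, 1e3)
}
```

A small but nonzero $\lambda_{p}$ sits between the two extremes: the lowest-cost samples dominate, yet enough of the distribution survives to seed the next sampling iteration.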

10.3 Clinical Evaluation Results

In this clinical trial, muscle tone levels were assessed using a rating scale with grades 0, 1, 1+, 2, 3, and 4, where grade 0 indicates normal muscle tone. For the purposes of numerical analysis, grade 1+ is quantified as 1.5. The evaluation results are documented in Table 10. All assessments were exclusively focused on the upper limb. In the upper limb segment of the FMA, the maximum score, indicating normal function, is 66 points.
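As a reading aid for Table 10, the per-group changes can be tabulated with a few lines of Python. The sketch is purely descriptive: with three subjects per group it supports no statistical inference. The grade labels are written back from the numeric values in the table using the 1+ to 1.5 mapping described above, and the variable and function names are hypothetical.

```python
from collections import defaultdict

# Numeric encoding of the muscle tone grades; grade 1+ is quantified as 1.5.
TONE_GRADES = {"0": 0.0, "1": 1.0, "1+": 1.5, "2": 2.0, "3": 3.0, "4": 4.0}
FMA_UPPER_LIMB_MAX = 66  # maximum upper-limb FMA score

# (group, subject, initial tone, post-trial tone, initial FMA, post-trial FMA)
TABLE_10 = [
    ("Control", 1, "1+", "1+", 8, 12),
    ("Control", 2, "1+", "1+", 31, 35),
    ("Control", 3, "2", "1+", 23, 30),
    ("Experimental", 4, "1+", "1", 8, 16),
    ("Experimental", 5, "2", "1+", 19, 35),
    ("Experimental", 6, "2", "2", 3, 5),
]

def summarize(rows):
    """Return the mean FMA gain and mean muscle tone change for each group."""
    fma_gain, tone_change = defaultdict(list), defaultdict(list)
    for group, _subject, tone0, tone1, fma0, fma1 in rows:
        fma_gain[group].append(fma1 - fma0)
        tone_change[group].append(TONE_GRADES[tone1] - TONE_GRADES[tone0])
    return {
        g: {
            "fma_gain": sum(fma_gain[g]) / len(fma_gain[g]),
            "tone_change": sum(tone_change[g]) / len(tone_change[g]),
        }
        for g in fma_gain
    }

summary = summarize(TABLE_10)
for group, stats in summary.items():
    print(f"{group}: mean FMA gain {stats['fma_gain']:.2f}, "
          f"mean tone change {stats['tone_change']:.2f}")
```

For the six rows in Table 10, the mean FMA gain is 5.0 points in the control group and about 8.7 points in the experimental group.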