Machine Learning based control of Exoskeletons using Action Diffusion Policy

PyTorch | Machine Learning | Diffusion Policy | Learning-from-Demonstration | Python | ROS | Hugging Face

Project Overview

During my internship at the Shirley Ryan Ability Lab, I developed a machine learning model based on the Action Diffusion paradigm to mimic the behavior of a physical therapist in a dyadic rehabilitation setup. The setup involves a patient and a therapist, both wearing Fourier ExoMotus X2 lower-limb exoskeletons, connected via a virtual spring-damper system. The therapist provides corrective gait assistance to the patient through this connection. The goal of the project was to replace the therapist with a model that can predict and replicate these corrective actions in real time, thereby automating the rehabilitation process. The Action Diffusion model was trained on joint state data from multiple rehabilitation sessions, enabling it to predict the therapist’s corrective actions based on the patient’s movement over a short time window.

Action Diffusion Policy automating the prediction of therapist corrective actions for patient gait rehabilitation.

Problem Statement and Objective

In traditional physical therapy, patients with motor impairments rely on therapists to assist in gait rehabilitation. The therapist actively helps the patient achieve correct foot placement and overall balance while walking. In dyadic rehabilitation setups involving exoskeletons, the therapist and patient are connected through a virtual spring-damper system, where the therapist applies corrective forces to guide the patient’s gait. While effective, this approach depends heavily on human intervention, making it less scalable and difficult to personalize across multiple sessions.


The goal of this project was to develop a machine learning model capable of predicting and executing the therapist’s corrective actions in real time, automating the rehabilitation process. By analyzing past interactions, the model provides personalized assistance to the patient, enabling therapist-free rehabilitation sessions tailored to the individual.

System Overview

This project uses two Fourier ExoMotus X2 lower-limb exoskeletons connected via a virtual spring-damper system, allowing the therapist to assist the patient by applying corrective forces. Each exoskeleton has four active degrees of freedom, one at each hip and knee joint, for precise motion control.
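The virtual spring-damper coupling can be sketched as a per-joint torque law. This is a minimal illustration, assuming a linear spring and damper acting on the difference in joint angle and joint velocity between the two exoskeletons. The function name `coupling_torques` and the gains `stiffness` and `damping` are hypothetical and are not taken from the project.

```python
from typing import List, Sequence


def coupling_torques(
    q_therapist: Sequence[float],
    q_patient: Sequence[float],
    dq_therapist: Sequence[float],
    dq_patient: Sequence[float],
    stiffness: float = 50.0,  # hypothetical spring gain, N*m/rad
    damping: float = 2.0,     # hypothetical damper gain, N*m*s/rad
) -> List[float]:
    """Torque on each patient joint from a linear virtual spring-damper."""
    return [
        stiffness * (qt - qp) + damping * (dqt - dqp)
        for qt, qp, dqt, dqp in zip(q_therapist, q_patient, dq_therapist, dq_patient)
    ]


# Four joints (two hips, two knees); only the first joint differs here.
torques = coupling_torques(
    q_therapist=[0.1, 0.0, 0.0, 0.0],
    q_patient=[0.0, 0.0, 0.0, 0.0],
    dq_therapist=[0.0] * 4,
    dq_patient=[0.0] * 4,
)
```

Under this assumed law, the torque pulls the patient's joint toward the therapist's joint angle, and the damping term resists differences in joint velocity.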

The system operates on a ROS-based framework, with communication handled by the CANOpenRobotController software stack, which talks to the exoskeletons using the CANopen protocol over a CAN bus. Joint positions from the patient and therapist are streamed as ROS messages and processed to deliver real-time torque commands to the patient’s exoskeleton, ensuring synchronized, dynamic rehabilitation.

ExoMotus-X2 lower-limb exoskeleton

Data Collection and Preprocessing

Data was collected from multiple rehabilitation sessions between a single patient-therapist pair, recording joint positions at 66.66 Hz. This high-frequency data captured the interactions between the exoskeletons, focusing on hip and knee joint angles.

The data was preprocessed into 1.5-second observation windows for input, with the model predicting the therapist’s corrective actions for the next 3 seconds. Windows were extracted with overlap so that consecutive examples share samples, and the signals were noise-filtered before being passed to the model.
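At 66.66 Hz (one sample roughly every 15 ms), a 1.5-second window holds about 100 samples and a 3-second horizon about 200. The windowing step might look like the sketch below. It assumes the action window starts immediately after the observation window; the function name `make_windows` and the `stride` value are hypothetical.

```python
from typing import List, Sequence, Tuple

SAMPLE_RATE_HZ = 66.66
OBS_STEPS = round(1.5 * SAMPLE_RATE_HZ)  # 100 samples of patient data
ACT_STEPS = round(3.0 * SAMPLE_RATE_HZ)  # 200 samples of therapist data

Frame = Sequence[float]  # one sample: the four hip/knee joint angles


def make_windows(
    patient: Sequence[Frame],
    therapist: Sequence[Frame],
    stride: int = 10,  # hypothetical step between window starts
) -> List[Tuple[Sequence[Frame], Sequence[Frame]]]:
    """Pair each observation window with the therapist frames that follow it."""
    n = min(len(patient), len(therapist))
    pairs = []
    for start in range(0, n - OBS_STEPS - ACT_STEPS + 1, stride):
        split = start + OBS_STEPS
        pairs.append((patient[start:split], therapist[split:split + ACT_STEPS]))
    return pairs
```

A stride smaller than `OBS_STEPS` produces overlapping windows, so each recorded step contributes to several training examples.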

Action Diffusion Model Architecture

The Action Diffusion model is the heart of this project, specifically designed to predict the therapist’s corrective actions in a dynamic rehabilitation setup. The model leverages a Denoising Diffusion Probabilistic Model (DDPM), which excels in learning from demonstration data, particularly in multimodal and time-sensitive environments like this one. This architecture is well-suited for the task of learning from the therapist-patient interaction, where the goal is to predict complex corrective actions over a series of time steps based on a short observation window.

Input and Model Architecture

The model’s input consists of the patient’s joint state data (hip and knee joint angles) captured at 66.66 Hz. A U-Net, an architecture known for capturing both local and global patterns efficiently, serves as the denoising network: it is trained to learn the denoising process and generate predictions. Its series of down-sampling and up-sampling layers lets the model capture detailed features while maintaining an overall understanding of the patient’s movement. The U-Net-based Action Diffusion model is well suited to predicting multimodal actions and handling variability in patient movement.

Action Diffusion Process

At the core of the model lies the denoising diffusion process, where noisy action trajectories are refined into structured predictions. The model is capable of generating multiple possible corrective actions, capturing the multimodal nature of the therapist’s behavior. For instance, depending on subtle variations in the patient’s gait (range of motion, foot-strike timing, etc.), the therapist may apply different corrective forces, and the model learns these variations from the training data through the diffusion process.
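The refinement from noise to an action trajectory can be sketched with the standard DDPM sampling loop. This is a generic illustration, not the project's code: the trajectory is flattened to a plain list, the linear beta schedule and step count are assumed defaults, and `predict_noise` is a hypothetical stand-in for the trained network, which would also be conditioned on the observation window.

```python
import math
import random
from typing import Callable, List, Optional, Tuple

NoisePredictor = Callable[[List[float], int], List[float]]


def make_schedule(num_steps: int) -> Tuple[List[float], List[float], List[float]]:
    """Linear beta schedule plus the derived alpha and cumulative-alpha terms."""
    beta_start, beta_end = 1e-4, 0.02  # common DDPM defaults, assumed here
    span = max(num_steps - 1, 1)
    betas = [beta_start + (beta_end - beta_start) * k / span for k in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def ddpm_sample(
    predict_noise: NoisePredictor,
    length: int,
    num_steps: int = 50,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Refine pure Gaussian noise into a trajectory, one denoising step at a time."""
    rng = rng or random.Random()
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]  # start from noise
    for k in reversed(range(num_steps)):
        eps = predict_noise(x, k)  # estimate of the noise currently in x
        coef = betas[k] / math.sqrt(1.0 - alpha_bars[k])
        mean = [(xi - coef * ei) / math.sqrt(alphas[k]) for xi, ei in zip(x, eps)]
        if k > 0:
            sigma = math.sqrt(betas[k])  # re-inject noise on all but the last step
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x
```

Because sampling starts from random noise and injects noise at each step, repeated calls with the same observation can yield different trajectories, which is how a diffusion model can represent several plausible corrective actions.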

To ensure smooth, real-time execution, the model processes inputs in a separate thread as soon as a new observation window is available. This allows for continuous overlapping predictions, ensuring the system can provide corrective actions without delay. The model is integrated into a ROS-based framework, where it continuously receives patient data and sends predicted corrective actions to the exoskeleton for real-time control.
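The threaded inference pattern can be sketched as follows. This is a simplified illustration using only the Python standard library rather than ROS; the class name `InferenceWorker`, its methods, and the policy callable are hypothetical, and the sketch assumes a sliding window in which every new sample yields a new candidate window once the buffer is full.

```python
import queue
import threading
from collections import deque
from typing import Any, Callable, List


class InferenceWorker:
    """Runs a policy in a background thread on full observation windows."""

    def __init__(self, policy: Callable[[List[Any]], Any], obs_steps: int) -> None:
        self._policy = policy
        self._window: deque = deque(maxlen=obs_steps)
        self._requests: queue.Queue = queue.Queue(maxsize=1)
        self.predictions: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add_sample(self, sample: Any) -> None:
        """Called from the control loop; never blocks on inference."""
        self._window.append(sample)
        if len(self._window) == self._window.maxlen:
            try:
                self._requests.put_nowait(list(self._window))
            except queue.Full:
                pass  # a window is already waiting for the worker; drop this one

    def _run(self) -> None:
        while True:
            window = self._requests.get()
            if window is None:  # sentinel sent by close()
                break
            self.predictions.put(self._policy(window))

    def close(self) -> None:
        """Finish any pending window, then stop the worker thread."""
        self._requests.put(None)
        self._thread.join()
```

In this sketch the control loop calls `add_sample` at the sensor rate and reads finished action sequences from `predictions`, so a slow inference never stalls the loop that sends commands to the exoskeleton.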

By automating the therapist’s role, the model enables personalized and scalable rehabilitation, providing corrective forces in real time with high accuracy.

Model Training

The model was trained on data collected from dyadic rehabilitation sessions. It converged within a few epochs, achieving high accuracy in predicting therapist actions based on the patient’s joint data. Mean squared error (MSE) was used as the primary loss function, so that training minimized the error between predicted and actual therapist actions.
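For reference, the Diffusion Policy paper listed in the acknowledgements computes its MSE on noise rather than directly on actions: a demonstrated action sequence is corrupted with random noise, and the network is trained to predict that noise given the observations. Whether this project's training loop uses exactly this form is an assumption; the symbols below follow the paper's notation rather than anything defined in this write-up.

\[
\mathcal{L} = \mathrm{MSE}\left(\epsilon^{k},\ \epsilon_\theta\left(O_t,\ A_t^{0} + \epsilon^{k},\ k\right)\right)
\]

Here \(O_t\) is the observation window, \(A_t^{0}\) is the clean therapist action sequence from the dataset, \(\epsilon^{k}\) is the noise sampled for diffusion step \(k\), and \(\epsilon_\theta\) is the denoising network. Minimizing this loss is what lets the sampler recover therapist-like actions from noise at inference time.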

Action Diffusion Process Overview. The model takes the latest \(t_o\) steps of observation data and outputs \(t_a\) steps of actions.
Mean Squared Error Loss Curve. The training loss continues to drop while the test loss increases after 12 epochs, indicating overfitting.

Results and Observations

The Action Diffusion model successfully learned to mimic the therapist’s corrective actions, predicting with high accuracy based on 1.5-second windows of patient joint data. Evaluation on test data showed that the model’s predictions closely matched the actual therapist actions, both in timing and magnitude.

The model was able to generalize well within the scope of the training data, though it showed signs of overfitting to the specific patient’s gait pattern, making its predictions less accurate when applied to new individuals. Train/test loss curves clearly demonstrated the model’s learning process, with early convergence and later signs of overfitting, which suggests a need for broader data to improve generalization.

Sample inference from the model compared against the ground truth therapist action from the test dataset.

Real-Time Prediction based on Patient Exoskeleton Joint Data by a Trained Model. The four windows represent the 4 joints of the exoskeleton (2 Hip, 2 Knee).

The periodic breaks in the predicted actions occur at the points where the model produces a new inference and a fresh action sequence replaces the previous one.

Future Work

While the Action Diffusion model shows strong potential in automating therapist behavior for dyadic rehabilitation, several areas for improvement remain. One key challenge observed was overfitting to the specific patient’s gait, which reduced the model’s ability to generalize to other patients. Future work should focus on expanding the dataset to include a broader range of patient-therapist interactions, capturing different levels of motor impairment and variability in walking patterns. This will improve the model’s ability to adapt to diverse rehabilitation scenarios.


Another promising direction is the integration of adaptive learning mechanisms, allowing the model to continuously learn and update based on real-time patient performance. This would enable the system to personalize the corrective actions dynamically, tailoring the rehabilitation to the patient’s evolving needs throughout the therapy session.


Further, testing the system in more complex environments, including overground walking and other varied rehabilitation tasks, will help validate the model’s robustness in real-world settings. Expanding this model for broader applications could greatly improve personalized rehabilitation, enabling scalable solutions beyond clinical environments.

Acknowledgements

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion – paper

Original Diffusion Policy repository – real-stanford/diffusion_policy

Nick Morales Diffusion Policy repository – ngmor/diffusion_policy

Virtual Physical Coupling of Two Lower-Limb Exoskeletons – paper

Exoskeleton Control using ROS – emekBaris/CANOpenRobotController
