Actuation Models
We currently have two models with different runtime vs accuracy trade-offs. Both approaches use the MuJoCo actuators
belonging to MIMo, but have different internal approaches to determining the output torque of these actuators. The
first, SpringDamperModel, uses a spring-damper system to
approximate force-length and force-velocity relationships. The second, MuscleModel,
models each actuator as two opposing, independently controllable muscles. Of the two, the Spring-Damper
Model is faster but less accurate, especially with regard to compliance.
In addition there is a “Positional” actuation model, which can be used to pose MIMo. The input action in this case is an
array of joint angles into which MIMo’s joints are locked.
ActuationModel | Abstract base class for MIMo's actuation model.
SpringDamperModel | Class for the Spring-Damper actuation model.
PositionalModel | This model allows posing MIMo or moving his joints along pre-determined trajectories.
muscle_testing | Functions in this module are designed to help transition actuation models for MIMo.
mimoActuation.actuation
This module defines the actuation model interface and provides two implementations.
The interface is defined as an abstract class in ActuationModel.
The spring-damper model is defined in SpringDamperModel.
A second implementation using direct positional control is PositionalModel.
- class mimoActuation.actuation.ActuationModel(env, actuators, *args)
Bases: object
Abstract base class for MIMo’s actuation model.
This class defines the functions that all implementing classes must provide.
Control inputs have two conceptual levels: The desired control input (i.e. maximum output in one direction), and the actual control input to the simulation motors. In the simulation the motor response is linear and instantaneous, but this may not be desired. Actuation models can model time-dependent or non-linear torque generation by taking the desired control input and altering it before passing it to the simulation. Actuation models can define an arbitrary control method, but must compute control inputs for the actual simulation motors as defined in the XMLs.
The key functions are:
- get_action_space() determines the actuation space attribute for the gym environment. This should have the shape of the input to the abstract model motors.
- action() computes the actual control inputs to the simulation motors from a control input to the abstract motors.
- substep_update() is called on every physics step and allows torques to be updated between environment steps.
- observations() should return any actuation-related quantities that could reasonably be used as observations for the gym environment. Note that these will only actually be included if the proprioception module is appropriately configured.
- cost() should return the cost of the current activations. This can represent the metabolic cost or an action penalty. This function is not used by default, but environments may use it as they wish, for example during reward calculation.
- reset() should reset whatever internal quantities the model uses to the value at the start of the simulation.
- Parameters
env (MIMoEnv) – The environment to which this model will be attached.
actuators (np.ndarray) – An array with the actuators, by ID, to include in this model.
- env
The environment to which this module will be attached.
- Type
gym.Env
- actuators
The simulation motors, by ID, to include in this model.
- Type
np.ndarray
- action_space
The action space for this model. This is set by get_action_space().
- Type
spaces.Box
- get_action_space()
Determines the actuation space attribute for the gym environment.
Note that this action space must be a Box!
- Returns
A gym spaces object with the actuation space.
- Return type
gym.spaces.Box
- action(action)
Converts abstract control inputs into actual motor inputs.
This function is called during every environment step and sets the actual motor inputs for each included actuator.
- Parameters
action (numpy.ndarray) – A numpy array with control values.
- substep_update()
Like action, but called on every physics step instead of every environment step.
This allows for torques to be updated every physics step.
- observations()
Collect any quantities for the observations.
- Returns
A flat numpy array with these quantities.
- Return type
np.ndarray
- cost()
Returns the “cost” of the current action.
This function may be used as an action penalty.
- Returns
The cost of the action.
- Return type
- reset()
Reset actuation model to the initial state.
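The interface above can be illustrated with a small, dependency-free sketch. Everything in it is hypothetical: `ActuationModelSketch` is a stand-in for the abstract base class (it is not the real `ActuationModel`), `DirectModel` is an invented example subclass, and a list of `(low, high)` tuples stands in for a gym Box. It only shows which methods an implementing class would be expected to provide.

```python
from abc import ABC, abstractmethod


class ActuationModelSketch(ABC):
    """Stand-in for the abstract interface described above (not the real class)."""

    def __init__(self, env, actuators):
        self.env = env
        self.actuators = actuators
        # Assumption: the action space is derived once, at construction time.
        self.action_space = self.get_action_space()

    @abstractmethod
    def get_action_space(self): ...

    @abstractmethod
    def action(self, action): ...

    @abstractmethod
    def substep_update(self): ...

    @abstractmethod
    def observations(self): ...

    @abstractmethod
    def cost(self): ...

    @abstractmethod
    def reset(self): ...


class DirectModel(ActuationModelSketch):
    """Hypothetical model that passes clipped inputs straight through."""

    def __init__(self, env, actuators):
        super().__init__(env, actuators)
        self.reset()

    def get_action_space(self):
        # One (low, high) pair per actuator stands in for a gym Box.
        return [(-1.0, 1.0) for _ in self.actuators]

    def action(self, action):
        # Clip each control value to its range and store it.
        self.control_input = [
            min(max(a, lo), hi) for a, (lo, hi) in zip(action, self.action_space)
        ]

    def substep_update(self):
        # Nothing is time-dependent in this toy model.
        pass

    def observations(self):
        return list(self.control_input)

    def cost(self):
        # Mean squared control input as a simple action penalty.
        return sum(u * u for u in self.control_input) / len(self.control_input)

    def reset(self):
        self.control_input = [0.0] * len(self.actuators)
```

A model with time-dependent behaviour would put its per-physics-step update into `substep_update()` instead of leaving it empty.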
- class mimoActuation.actuation.SpringDamperModel(env, actuators)
Bases: mimoActuation.actuation.ActuationModel
Class for the Spring-Damper actuation model.
In this model, MIMo’s muscles are represented by torque motors with linear and instantaneous control response, i.e. the abstract model directly matches the in-simulation definitions. The force-velocity and force-length relationships of real muscles are approximated using damping and spring components in the joint definitions of MIMo. The maximum torque of the motors is set to the maximum voluntary isometric torque along the corresponding axis, with a control input of 1 representing maximum torque.
In addition to the attributes from the base actuation class, there are two extra attributes:
- control_input
Contains the current control input.
- Type
np.ndarray
- max_torque
The maximum motor torques.
- Type
np.ndarray
- get_action_space()
Determines the actuation space attribute for the gym environment.
The actuation space directly corresponds to the control range of the simulation motors. Unless modified, this will be [-1, 1] for all motors.
- Returns
The actuation space.
- Return type
gym.spaces.Space
- action(action)
Set the control inputs for the next step.
Control values are clipped to the control range limits defined in the MuJoCo XMLs and normalized to be even in both directions, i.e. an input of 0 corresponds to the center of the control range, rather than the default or neutral control position. The control ranges for the MIMo XMLs are set up to be symmetrical, such that an input of 0 corresponds to no motor torque.
- Parameters
action (numpy.ndarray) – A numpy array with control values.
- observations()
Control input and output torque for each motor at this time step.
- Returns
A flat array with the control inputs and output torques.
- Return type
np.ndarray
- cost()
Provides a cost function for current motor usage.
The cost is given by \(\sum_{i=1}^n \frac{u_i^2 * T_{max_i}}{n \sum_{i=1}^n T_{max_i}}\), where \(u_i\) and \(T_{max_i}\) are the control signal and maximum motor torque of motor \(i\), respectively, and \(n\) is the number of motors in the model.
- Returns
The cost as described above.
- Return type
- simulation_torque()
Computes the currently applied torque for each motor in the simulation.
- Returns
An array with applied torques for each motor.
- Return type
np.ndarray
- reset()
Reset actuation model to the initial state.
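The cost formula given for cost() of the Spring-Damper model can be written out as a short sketch. The function name `actuation_cost` and the use of plain Python lists are assumptions made for illustration; the documented model works on NumPy arrays and its actual code may differ.

```python
def actuation_cost(controls, max_torques):
    """Torque-weighted mean of squared control signals.

    Implements sum_i(u_i^2 * T_max_i) / (n * sum_i(T_max_i)), where n is the
    number of motors.
    """
    n = len(controls)
    total_torque = sum(max_torques)
    weighted = sum(u * u * t for u, t in zip(controls, max_torques))
    return weighted / (n * total_torque)
```

With every control signal at full magnitude the weighted sum equals the total torque, so the cost reduces to 1/n. Strong motors contribute more to the cost than weak ones for the same control signal.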
- class mimoActuation.actuation.PositionalModel(env, actuators)
Bases: mimoActuation.actuation.ActuationModel
This model allows posing MIMo or moving his joints along pre-determined trajectories.
The ‘action’ input represents desired joint positions. MIMo will be locked into these at each timestep. Unlike the other actuation models this doesn’t use the MuJoCo actuators in the scene but instead adjusts the equality constraints used to lock each joint into position. To determine which joints should be included we use the joints associated with the actuators in the ‘actuators’ parameter. Note that this requires that there is an equality constraint in the XMLs for each actuated joint. This is true for MIMo by default.
In addition to the attributes from the base actuation class, there are three extra attributes:
- control_input
Contains the current control input.
- Type
np.ndarray
- actuated_joints
Contains an array of joint IDs associated with the actuators.
- Type
np.ndarray
- constraints
Contains an array of constraint IDs belonging to the joints in ‘actuated_joints’.
- Type
np.ndarray
- get_constraints()
Collects the constraints associated with the actuated joints in the scene.
- Returns
An array with the constraint IDs.
- Return type
np.ndarray
- get_action_space()
Determines the actuation space attribute for the gym environment.
The actuation space directly corresponds to the range of motion of the joints in radians.
- Returns
A gym spaces object with the actuation space.
- Return type
gym.spaces.Space
- action(action)
Locks the joints into the positions provided by ‘action’.
Control values are clipped to the joint range of motion.
- Parameters
action (numpy.ndarray) – A numpy array with desired joint positions.
- observations()
Returns the current control input, i.e. the locked positions.
- Returns
A flat numpy array with the control input.
- Return type
np.ndarray
- reset()
Reset actuation model to the initial state.
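The clipping step described for action() of the positional model can be sketched as follows. The helper `clip_to_joint_range` and the `(low, high)` tuples in radians are hypothetical; the sketch does not touch MuJoCo equality constraints and only shows how desired joint positions would be limited to each joint's range of motion.

```python
def clip_to_joint_range(targets, joint_ranges):
    """Limit each desired joint position to that joint's (low, high) range."""
    return [min(max(q, lo), hi) for q, (lo, hi) in zip(targets, joint_ranges)]
```

A target inside the range passes through unchanged, while a target outside is moved to the nearest limit.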
mimoActuation.muscle
This module defines the base class for the muscle actuation model.
Authors: Pierre Schumacher, Dominik Mattern
- class mimoActuation.muscle.MuscleModel(env, actuators)
Bases: mimoActuation.actuation.ActuationModel
Class for the muscle actuation model.
Implementation of the muscle model as seen in https://arxiv.org/abs/2207.03952. Each actuator is internally modeled as two opposing muscles. These follow the force-length and force-velocity curves as described in the paper. Torque is applied in the simulation by setting the gear ratio to the computed output torque and applying a dummy control signal of 1. There are many parameters in this model, two of which were tweaked for MIMo specifically. The function used for this is calibrate_full().
This model loads and modifies data from the actuators and joints in the MIMo XML, which are effectively part of the specifications for this model. Changing them before this model is initialized might have unintended consequences for the actuation!
- tau
Time constant for the activity. A higher tau means muscle activity takes longer to build up to the control signal.
- Type
- fmax
Force multiplier to translate the normalised force-length and force-velocity curves into appropriate ranges.
- Type
float|np.ndarray
- vmax
Reference velocity for the force-velocity curve. A higher vmax leads to increased force at high virtual muscle velocities.
- Type
float|np.ndarray
- target_activity
Current control input. activity will approach this value over time.
- Type
np.ndarray
- activity
Muscle activity.
- Type
np.ndarray
- get_action_space()
Determines the actuation space attribute for the gym environment.
The actuation space consists of two opposing muscles for each motor in the simulation, each with range [0, 1].
- Returns
A gym spaces object with the actuation space.
- Return type
spaces.Space
- action(action)
Set the control inputs for the next step.
Input values are clipped to the action space.
- Parameters
action (numpy.ndarray) – A numpy array with control values.
- substep_update()
Update muscle activity and torque.
As activity is time-dependent we update activity and the output torque every physics step. The desired activity level (input action) is not changed during this.
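One way to picture the per-step activity update is a first-order lag towards the target activity. This is an assumption made for illustration: the text above only states that activity approaches target_activity over time and that a higher tau slows this down. The function name `update_activity` and the explicit Euler step are hypothetical.

```python
def update_activity(activity, target_activity, tau, dt):
    """One explicit Euler step of d(activity)/dt = (target - activity) / tau."""
    return [a + dt * (t - a) / tau for a, t in zip(activity, target_activity)]
```

Calling this once per physics step with a fixed target moves the activity a fraction dt / tau of the remaining distance each time, so a larger tau gives a slower build-up.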
- observations()
Returns muscle activations and forces for every actuator.
- Returns
A flat array with the quantities described above.
- Return type
np.ndarray
- cost()
Approximates the metabolic cost of muscle activations.
Currently, it is given by \(\sum_{i=1}^n \frac{m_{a_i}^2 * f_{max_i}}{n \sum_{i=1}^n f_{max_i}}\), where \(m_{a_i}\) and \(f_{max_i}\) are the activation and the maximum isometric muscle force of muscle \(i\), respectively, and \(n\) is the number of muscles in the model.
- Returns
The actuation cost.
- Return type
- reset()
Set activity to zero and recompute muscle quantities.
- property muscle_activations
Activity for every muscle.
- Returns
An array with copies of the activity for every muscle.
- Return type
np.ndarray
- property muscle_lengths
Virtual muscle lengths for all muscles.
- Returns
An array with copies of the virtual muscle lengths.
- Return type
np.ndarray
- property muscle_velocities
Virtual muscle speeds for all muscles.
- Returns
An array with copies of the virtual muscle velocities.
- Return type
np.ndarray
- property muscle_forces
Muscle force vectors.
- Returns
An array with copies of the forces applied by each muscle.
- Return type
np.ndarray
- fl(lce)
Force-length curve as implemented by MuJoCo.
- Parameters
lce (np.ndarray) – Virtual muscle lengths for MIMo.
- Returns
An array with the force-length multipliers.
- Return type
np.ndarray
- fv(lce_dot)
Force-velocity curve as implemented by MuJoCo.
- Parameters
lce_dot (np.ndarray) – Virtual muscle velocities for MIMo.
- Returns
An array with the force-velocity multipliers.
- Return type
np.ndarray
- fp(lce)
Parallel elasticity (passive muscle force) as implemented by MuJoCo.
- Parameters
lce (np.ndarray) – Virtual muscle lengths for MIMo.
- Returns
An array with the passive force components.
- Return type
np.ndarray
- simulation_torque()
Computes the currently applied torque for each motor in the simulation.
- Returns
A numpy array with applied torques for each motor.
- Return type
np.ndarray
- collect_data_for_actuators()
Collect all muscle related values at the current timestep for all of MIMo’s actuators.
- Returns
A list containing the joint position and velocity, corrected position, output torque, desired target muscle activity, actual current muscle activity, virtual muscle length, virtual muscle velocity, muscle force, FL factor, FV factor and the FP component for all muscles.
- Return type
List[np.ndarray]
- mimoActuation.muscle.bump(length, a, mid, b)
Part of the force length relationship as implemented by MuJoCo.
The parameters a, mid and b define the shape of the force-length curve. See https://arxiv.org/abs/2207.03952 for more details.
- Parameters
- Returns
Resulting force-length multiplier.
- Return type
np.ndarray
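For orientation, a scalar sketch of a piecewise-quadratic bump of the kind used in MuJoCo's muscle force-length curve is given here. It assumes that a and b are the lengths at which the curve reaches zero and that mid is the length at which it peaks at 1. The name `bump_scalar` is hypothetical, and the documented function returns an np.ndarray rather than a float.

```python
def bump_scalar(length, a, mid, b):
    """Piecewise-quadratic bump: 0 outside (a, b), rising to 1 at mid."""
    left = 0.5 * (a + mid)    # midpoint of the rising flank
    right = 0.5 * (mid + b)   # midpoint of the falling flank
    if length <= a or length >= b:
        return 0.0
    if length < left:
        x = (length - a) / (left - a)
        return 0.5 * x * x
    if length < mid:
        x = (mid - length) / (mid - left)
        return 1.0 - 0.5 * x * x
    if length < right:
        x = (length - mid) / (right - mid)
        return 1.0 - 0.5 * x * x
    x = (b - length) / (b - right)
    return 0.5 * x * x
```

The four quadratic pieces join continuously, with the value 0.5 at the midpoint of each flank.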
mimoActuation.muscle_testing
Functions in this module are designed to help transition actuation models for MIMo.
By default, MIMo uses direct torque motors for actuation with the maximum torques corresponding to the maximum voluntary isometric torques. A second actuation model exists based on https://arxiv.org/abs/2207.03952. This second model more accurately represents the position and velocity dependent force generating behaviour of real muscles. The second model requires several adjustments to the actuation and joint parameters, which can be done using the functions in this module.
- mimoActuation.muscle_testing.vectorized(fn)
Simple vector wrapper for functions that clearly came from C.
- mimoActuation.muscle_testing.fl(vec)
- mimoActuation.muscle_testing.bump(length, a, mid, b)
Part of the force length relationship as implemented by MuJoCo.
The parameters a, mid and b define the shape of the force-length curve. See https://arxiv.org/abs/2207.03952 for more details.
- Parameters
- Returns
Resulting force-length multiplier.
- Return type
np.ndarray
- mimoActuation.muscle_testing.fp(vec)
- mimoActuation.muscle_testing.fv_vec(lce_dot, vmax)
Force-velocity curve.
- Parameters
lce_dot (np.ndarray) – Array with virtual muscle velocities.
vmax (np.ndarray|float) – Array or float with the VMAX value.
- Returns
The corresponding force-velocity multipliers.
- Return type
np.ndarray
- mimoActuation.muscle_testing.force_vel_v_vec(velocity, c, vmax, fvmax)
Force velocity relationship as implemented by MuJoCo.
- Parameters
velocity (np.ndarray) – Array with virtual muscle velocities.
c (float) – Virtual velocity at which the curve is 1. Determines the shape of the curve.
vmax (np.ndarray|float) – Scaling factor VMAX. Determines the shape of the curve.
fvmax (float) – Maximum multiplier due to velocity. Determines the shape of the curve.
- Returns
The corresponding force-velocity multipliers.
- Return type
np.ndarray
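A scalar sketch of the force-velocity curve in the form used by MuJoCo's muscle model may help when reading the parameters above. It rests on two assumptions: the velocity is normalized by vmax, and the shape parameter is fixed to fvmax - 1 as in MuJoCo, so the separate c argument is not modelled. The name `force_velocity_scalar` is hypothetical.

```python
def force_velocity_scalar(velocity, vmax, fvmax):
    """Force-velocity multiplier for one virtual muscle velocity."""
    v = velocity / vmax   # normalized velocity
    y = fvmax - 1.0       # normalized velocity at which the curve saturates
    if v <= -1.0:
        return 0.0        # shortening at or beyond vmax: no force
    if v <= 0.0:
        return (v + 1.0) ** 2
    if v <= y:
        return fvmax - (y - v) ** 2 / y
    return fvmax          # plateau for fast lengthening
```

At zero velocity the multiplier is 1, it falls to 0 when shortening at vmax, and it saturates at fvmax when lengthening.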
- mimoActuation.muscle_testing.vmax_calibration(env_name, n_episodes, save_dir, lr=0.1, lr_decay=0.8, decay_lr_every=100, make_plots=True)
Iteratively calibrate VMAX parameters for the muscle model.
We determine VMAX with an iterative procedure. Using an initial value we take random actions and measure the maximum achieved joint velocity. The initial VMAX values are then updated using learning rate lr and we continue with more random actions. The learning rate is updated every decay_lr_every episodes by factor lr_decay. The procedure continues for n_episodes episodes. Optionally VMAX can be plotted for every step by setting make_plots to True. We use the environment as provided by env_name. For MIMo these are fixed environments in which MIMo is hovering in the air with gravity disabled entirely. Muscle actions do not use the full range of inputs, instead we randomly set maximum or minimum inputs with no in between. The final VMAX values are saved to a file “vmax.npy” in the plotting directory.
- Parameters
env_name (str) – The name of the environment to be used for the calibration. Must use the muscle model.
n_episodes (int) – The total number of episodes.
save_dir (str) – The directory where the final VMAX and any plots will be saved.
lr (float) – The learning rate used to update VMAX every episode. Default 0.1.
lr_decay (float) – The learning rate is multiplied by this factor every decay_lr_every episodes. Default 0.8.
decay_lr_every (int) – How often the learning rate is updated. Default 100.
make_plots (bool) – If True we plot the change in VMAX over time and save as a file in the plotting directory. Default True.
- Returns
A numpy array with the final VMAX values.
- Return type
np.ndarray
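The iterative procedure can be sketched for a single VMAX value. The update rule used here, moving VMAX towards the measured maximum velocity by a fraction lr, is an assumption, as are the names `calibrate_vmax_sketch` and `measure_max_velocity`. The real function runs episodes in a MuJoCo environment, which is replaced by a callable in this sketch.

```python
def calibrate_vmax_sketch(initial_vmax, measure_max_velocity, n_episodes,
                          lr=0.1, lr_decay=0.8, decay_lr_every=100):
    """Iteratively move vmax towards the measured maximum joint velocity."""
    vmax = initial_vmax
    for episode in range(1, n_episodes + 1):
        # Stand-in for running one episode of random actions with this vmax.
        measured = measure_max_velocity(vmax)
        # Assumed update rule: step towards the measurement.
        vmax += lr * (measured - vmax)
        # Decay the learning rate every decay_lr_every episodes.
        if episode % decay_lr_every == 0:
            lr *= lr_decay
    return vmax
```

With a measurement that does not depend on vmax, the estimate converges geometrically to the measured value, and the decaying learning rate damps late updates.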
- mimoActuation.muscle_testing.fmax_calibration(env_name, save_dir, n_iterations=3, make_plots=True)
Calibrate FMAX parameters for the muscle model.
The calibration procedure is as follows: We take the desired maximum force values from the actuator definitions in the scene XML. We then apply maximum control input in one direction for 500 steps, back off for 500 steps, and then maximum input in the opposite direction for 500 steps. The maximum torque actually generated during this is recorded for each direction and compared against the desired values. The FMAX parameter is then adjusted such that the generated and desired torques match. This is performed iteratively as all MuJoCo constraints are soft and even locked joints will change position slightly based on applied torque.
This method requires a specialised scene to measure maximum voluntary isometric muscle torque in which all joints are locked in the angle at which the torque is to be measured. Note also that if the initial FMAX (set in MIMoMuscleEnv) is too large the motors may overcome the joint locking entirely, leading to NaNs and associated errors. In this case adjust the initial FMAX downwards.
- Parameters
env_name (str) – The name of the environment to be used for the calibration. Must use the muscle model.
save_dir (str) – The directory where the final FMAX and any plots will be saved.
n_iterations (int) – How many iterations of the calibration to perform. Default 3.
make_plots (bool) – If True we plot muscle parameters during the last iteration. Default True.
- Returns
A numpy array with the final FMAX values.
- Return type
np.ndarray
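The adjustment loop can be sketched for a single actuator. The proportional rescaling of FMAX by the ratio of desired to measured torque is an assumption about how "adjusted such that the generated and desired torques match" could be realised. The names `calibrate_fmax_sketch` and `measure_torque` are hypothetical, and the callable stands in for the locked-joint simulation.

```python
def calibrate_fmax_sketch(fmax, desired_torque, measure_torque, n_iterations=3):
    """Rescale fmax until the measured peak torque matches the desired one."""
    for _ in range(n_iterations):
        # Stand-in for the maximum-input test in the locked-joint scene.
        measured = measure_torque(fmax)
        # Assumed update: scale fmax by the torque ratio.
        fmax *= desired_torque / measured
    return fmax
```

If torque were exactly proportional to FMAX, one iteration would suffice. Repeating the step covers the case described above, where soft constraints let the joint shift slightly and change the measured torque.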
- mimoActuation.muscle_testing.create_joint_plots(plot_dir, data, dt=None)
Creates a series of plots for muscle data.
This function is designed to be used with collect_data_for_actuators(). The data argument should be a dictionary with the data for each actuator saved as an array with the actuator name as the dictionary key. The structure of the array should have steps or time as the first dimension and the different return values as the second.
- mimoActuation.muscle_testing.average_left_right(env, array)
Averages an array with actuator values between left and right side actuators of MIMo.
Actuators without symmetric versions are left as is.
- Parameters
env (mimoEnv.envs.MIMoEnv) – A MIMo environment.
array (np.ndarray) – An array with actuator values. Note that the first dimension must have the same size as the number of MIMo actuators in the environment.
- Returns
The averaged array.
- Return type
np.ndarray
- mimoActuation.muscle_testing.plotting_episode(env_name, save_dir)
Performs a single episode, saving and creating joint value plots.
We randomize action inputs to either maximum or minimum values every 200 steps.
- mimoActuation.muscle_testing.recording_episode(env_name, video_dir, env_params, video_width=500, video_height=500, camera_name=None, make_joint_plots=True, binary_actions=False, interactive=False)
Perform a single episode, saving joint data and creating a video recording.
We randomize action inputs every 200 steps.
- Parameters
env_name (str) – The environment to use.
video_dir (str) – The directory where the video and any plots will be saved.
env_params (Dict) – A dictionary with parameters to the environment. Keys are parameter names.
video_width (int) – The width of the rendered video. Default 500.
video_height (int) – The height of the rendered video. Default 500.
camera_name (str|None) – The name of the camera to use for the video. If None, the MuJoCo freecam is used (camera ID -1). Default None.
make_joint_plots (bool) – If True we also save plots of joint and muscle parameters over time. Default True.
binary_actions (bool) – If True, actions are randomized to be minimal or maximal. Default False.
interactive (bool) – If True, an interactive window is also rendered. Default False.
- mimoActuation.muscle_testing.compliance_test()
Performs the compliance test from the paper.
- mimoActuation.muscle_testing.calibrate_full(save_dir, n_fmax=3, n_vmax=30, n_episodes_per_it=50, n_episodes_video=5, lr_initial=0.1, lr_decay=0.7, fmax_scene='MIMoMuscleStaticTest-v0', vmax_scene='MIMoVelocityMuscleTest-v0', video_scene=None)
Determine muscle parameters for a given model.
Performs FMAX and VMAX calibrations. Afterward some scenes can be recorded to video with the new muscle parameters. Note that the parameter calibration requires specialised scenes, see the documentation for fmax_calibration() and vmax_calibration() for more information.
- Parameters
save_dir (str) – The directory where output files and subdirectories will be created.
n_fmax (int) – The number of iterations for the FMAX calibration. Default 3.
n_vmax (int) – The number of iterations for the VMAX calibration. Default 30.
n_episodes_per_it (int) – The number of episodes for each VMAX iteration. Default 50.
n_episodes_video (int) – After calibration, this many episodes will be recorded to video using the new parameters.
lr_initial (float) – The initial learning rate for the VMAX iteration. Default 0.1.
lr_decay (float) – Decay factor after each VMAX iteration. Default 0.7.
fmax_scene (str) – The environment, by name, to use for the FMAX calibration.
vmax_scene (str) – The environment, by name, to use for the VMAX calibration.
video_scene (str) – The environment, by name, that will be used to record videos. If None, no video is recorded. Default None.
- Returns
The new FMAX and VMAX parameters.
- Return type
Tuple[np.ndarray, np.ndarray]
- mimoActuation.muscle_testing.repeatability_test(save_dir, n_fmax=3, n_vmax=30, n_episodes_per_it=50, n_episodes_video=5, lr_initial=0.1, lr_decay=0.7, fmax_scene='MIMoMuscleStaticTest-v0', vmax_scene='MIMoVelocityMuscleTest-v0', video_scene=None, n_repeats=3)
Performs multiple full calibrations and compares the results against one another for repeatability.
- Parameters
save_dir (str) – The directory where output files and subdirectories will be created.
n_fmax (int) – The number of iterations for the FMAX calibration. Default 3.
n_vmax (int) – The number of iterations for the VMAX calibration. Default 30.
n_episodes_per_it (int) – The number of episodes for each VMAX iteration. Default 50.
n_episodes_video (int) – After calibration, this many episodes will be recorded to video using the new parameters.
lr_initial (float) – The initial learning rate for the VMAX iteration. Default 0.1.
lr_decay (float) – Decay factor after each VMAX iteration. Default 0.7.
fmax_scene (str) – The environment, by name, to use for the FMAX calibration.
vmax_scene (str) – The environment, by name, to use for the VMAX calibration.
video_scene (str) – The environment, by name, that will be used to record videos. If None, no video is recorded. Default None.
n_repeats (int) – The number of repetitions.
- mimoActuation.muscle_testing.make_flfvfp_plots()
Creates a set of plots to show the FL, FV and FP curves.