Actuation Models

We currently provide two actuation models with different trade-offs between runtime and accuracy. Both use the MuJoCo actuators belonging to MIMo but differ in how they determine the output torque of these actuators. The first, SpringDamperModel, uses a spring-damper system to approximate force-length and force-velocity relationships. The second, MuscleModel, models each actuator as two opposing, independently controllable muscles. The spring-damper model is faster but less accurate, especially with regard to compliance. In addition, there is a “Positional” actuation model, PositionalModel, which can be used to pose MIMo. The input action in this case is an array of joint angles into which MIMo’s joints are locked.

`mimoActuation.actuation.ActuationModel`	Abstract base class for MIMo's actuation model.
`mimoActuation.actuation.SpringDamperModel`	Class for the Spring-Damper actuation model.
`mimoActuation.actuation.PositionalModel`	This model allows posing MIMo or moving his joints along pre-determined trajectories.
`mimoActuation.muscle_testing`	Functions in this module are designed to help transition actuation models for MIMo.

mimoActuation.actuation

This module defines the actuation model interface and provides two implementations.

The interface is defined as an abstract class in ActuationModel. The spring-damper model is defined in SpringDamperModel. A second implementation using direct positional control is PositionalModel.

class mimoActuation.actuation.ActuationModel(env, actuators, *args)

Bases: object

Abstract base class for MIMo’s actuation model.

This class defines the functions that all implementing classes must provide.

Control inputs have two conceptual levels: The desired control input (i.e. maximum output in one direction), and the actual control input to the simulation motors. In the simulation the motor response is linear and instantaneous, but this may not be desired. Actuation models can model time-dependent or non-linear torque generation by taking the desired control input and altering it before passing it to the simulation. Actuation models can define an arbitrary control method, but must compute control inputs for the actual simulation motors as defined in the XMLs.

The key functions are:

get_action_space() determines the actuation space attribute for the gym environment. This should have the shape of the input to the abstract model motors.
action() computes the actual control inputs to the simulation motors from a control input to the abstract motors.
substep_update() is called on every physics step and allows torques to be updated between environment steps.
observations() should return any actuation-related quantities that could reasonably be used as observations for the gym environment. Note that these will only actually be included if the proprioception module is appropriately configured.
cost() should return the cost of the current activations. This can represent the metabolic cost or an action penalty. This function is not used by default, but environments may use it as they wish, for example during reward calculation.
reset() should reset whatever internal quantities the model uses to the value at the start of the simulation.

Parameters

env (MIMoEnv) – The environment to which this model will be attached.
actuators (np.ndarray) – An array with the actuators, by ID, to include in this model.

env

The environment to which this model will be attached.

Type: gym.Env

actuators

The simulation motors, by ID, to include in this model.

Type: np.ndarray

n_actuators

The number of actuators controlled by this model.

Type: int

action_space

The action space for this model. This is set by get_action_space().

Type: spaces.Box

get_action_space()

Determines the actuation space attribute for the gym environment.

Note that this action space must be a Box!

Returns: A gym spaces object with the actuation space.
Return type: gym.spaces.Box

action(action)

Converts abstract control inputs into actual motor inputs.

This function is called during every environment step and sets the actual motor inputs for each included actuator.

Parameters: action (numpy.ndarray) – A numpy array with control values.

substep_update()

Like action, but called on every physics step instead of every environment step.

This allows for torques to be updated every physics step.

observations()

Collect any quantities for the observations.

Returns: A flat numpy array with these quantities.
Return type: np.ndarray

cost()

Returns the “cost” of the current action.

This function may be used as an action penalty.

Returns: The cost of the action.
Return type: float

reset(): Reset actuation model to the initial state.

class mimoActuation.actuation.SpringDamperModel(env, actuators)

Bases: mimoActuation.actuation.ActuationModel

Class for the Spring-Damper actuation model.

In this model, MIMo’s muscles are represented by torque motors with linear and instantaneous control response, i.e. the abstract model directly matches the in-simulation definitions. The force-velocity and force-length relationships of real muscles are approximated using damping and spring components in the joint definitions of MIMo. The maximum torque of the motors is set to the maximum voluntary isometric torque along the corresponding axis, with a control input of 1 representing maximum torque.

In addition to the attributes from the base actuation class, there are two extra attributes:

control_input

Contains the current control input.

Type: np.ndarray

max_torque

The maximum motor torques.

Type: np.ndarray

get_action_space()

Determines the actuation space attribute for the gym environment.

The actuation space directly corresponds to the control range of the simulation motors. Unless modified, this will be [-1, 1] for all motors.

Returns: The actuation space.
Return type: gym.spaces.Space

action(action)

Set the control inputs for the next step.

Control values are clipped to the control range limits defined in the MuJoCo XMLs and normalized to be even in both directions, i.e. an input of 0 corresponds to the center of the control range, rather than the default or neutral control position. The control ranges for the MIMo XMLs are set up to be symmetrical, such that an input of 0 corresponds to no motor torque.

Parameters: action (numpy.ndarray) – A numpy array with control values.

observations()

Control input and output torque for each motor at this time step.

Returns: A flat array with the control inputs and output torques.
Return type: np.ndarray

cost()

Provides a cost function for current motor usage.

The cost is given by \(\sum_{i=1}^n \frac{u_i^2 * T_{max_i}}{n \sum_{i=1}^n T_{max_i}}\), where \(u_i\) and \(T_{max_i}\) are the control signal and maximum motor torque of motor \(i\), respectively, and \(n\) is the number of motors in the model.

Returns: The cost as described above.
Return type: float
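
As a worked illustration of the formula above, the following sketch evaluates the cost for plain Python lists. The function name actuation_cost is hypothetical, and the documented method operates on the model's internal arrays rather than on arguments.

```python
def actuation_cost(controls, max_torques):
    """Torque-weighted sum of squared controls, normalized by n * sum(T_max)."""
    n = len(controls)
    total_torque = sum(max_torques)
    weighted = sum(u * u * t for u, t in zip(controls, max_torques))
    return weighted / (n * total_torque)
```

With every control at full magnitude the numerator equals the sum of the maximum torques, so the cost peaks at 1/n under this formula.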

simulation_torque()

Computes the currently applied torque for each motor in the simulation.

Returns: An array with applied torques for each motor.
Return type: np.ndarray

reset(): Reset actuation model to the initial state.

class mimoActuation.actuation.PositionalModel(env, actuators)

Bases: mimoActuation.actuation.ActuationModel

This model allows posing MIMo or moving his joints along pre-determined trajectories.

The ‘action’ input represents desired joint positions. MIMo will be locked into these at each timestep. Unlike the other actuation models this doesn’t use the MuJoCo actuators in the scene but instead adjusts the equality constraints used to lock each joint into position. To determine which joints should be included we use the joints associated with the actuators in the ‘actuators’ parameter. Note that this requires that there is an equality constraint in the XMLs for each actuated joint. This is true for MIMo by default.

In addition to the attributes from the base actuation class, there are three extra attributes:

control_input

Contains the current control input.

Type: np.ndarray

actuated_joints

Contains an array of joint IDs associated with the actuators.

Type: np.ndarray

constraints

Contains an array of constraint IDs belonging to the joints in ‘actuated_joints’.

Type: np.ndarray

get_constraints()

Collects the constraints associated with the actuated joints in the scene.

Returns: An array with the constraint IDs.
Return type: np.ndarray

get_action_space()

Determines the actuation space attribute for the gym environment.

The actuation space directly corresponds to the range of motion of the joints in radians.

Returns: A gym spaces object with the actuation space.
Return type: gym.spaces.Space

action(action)

Locks the joints into the positions provided by ‘action’.

Control values are clipped to the joint range of motion.

Parameters: action (numpy.ndarray) – A numpy array with desired joint positions.
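
The clipping step can be sketched as follows. The function name and the representation of joint ranges as (low, high) pairs in radians are assumptions made for illustration; the documented method reads the ranges from the simulation.

```python
def clip_to_joint_range(targets, joint_ranges):
    """Clamp each desired joint angle to its (low, high) range of motion."""
    return [min(max(q, low), high) for q, (low, high) in zip(targets, joint_ranges)]
```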

observations()

Returns the current control input, i.e. the locked positions.

Returns: A flat numpy array with the control input.
Return type: np.ndarray

reset(): Reset actuation model to the initial state.

cost()

Dummy function.

Returns: Always returns 0.
Return type: float

mimoActuation.muscle

This module defines the base class for the muscle actuation model.

Authors: Pierre Schumacher, Dominik Mattern

class mimoActuation.muscle.MuscleModel(env, actuators)

Bases: mimoActuation.actuation.ActuationModel

Class for the muscle actuation model.

Implementation of the muscle model as seen in https://arxiv.org/abs/2207.03952. Each actuator is internally modeled as two opposing muscles. These follow the force-length and force-velocity curves as described in the paper. Torque is applied in the simulation by setting the gear ratio to the computed output torque and applying a dummy control signal of 1. There are many parameters in this model, two of which (FMAX and VMAX) were tuned for MIMo specifically. The function used for this is calibrate_full().

This model loads and modifies data from the actuators and joints in the MIMo XML, which are effectively part of the specification for this model. Changing them before this model is initialized might have unintended consequences for the actuation!

lmax

Determines the shape of the force-length curve.

Type: float

lmin

Determines the shape of the force-length curve.

Type: float

fvmax

The highest multiplier due to the force-velocity curve.

Type: float

fpmax

Multiplier for the passive force component.

Type: float

lce_min

Minimum virtual muscle length.

Type: float

lce_max

Maximum virtual muscle length.

Type: float

tau

Time constant for the activity. A higher tau means muscle activity takes longer to build up to the control signal.

Type: float

fmax

Force multiplier to translate the normalised force-length and force-velocity curves into appropriate ranges.

Type: float|np.ndarray

vmax

Reference velocity for the force-velocity curve. A higher vmax leads to increased force at high virtual muscle velocities.

Type: float|np.ndarray

target_activity

Current control input. activity will approach this value over time.

Type: np.ndarray

activity

Muscle activity.

Type: np.ndarray

get_action_space()

Determines the actuation space attribute for the gym environment.

The actuation space consists of two opposing muscles for each motor in the simulation, each with range [0, 1].

Returns: A gym spaces object with the actuation space.
Return type: spaces.Space

action(action)

Set the control inputs for the next step.

Input values are clipped to the action space.

Parameters: action (numpy.ndarray) – A numpy array with control values.

substep_update()

Update muscle activity and torque.

As activity is time-dependent we update activity and the output torque every physics step. The desired activity level (input action) is not changed during this.
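
The role of tau in this update can be illustrated with a first-order lag. This is a sketch under the assumption that activity follows da/dt = (target - a) / tau, integrated with one explicit Euler step per physics step; the exact update rule used by the model is not stated here. Function names and the list-based representation are hypothetical.

```python
def step_activity(activity, target_activity, tau, dt):
    """One explicit Euler step of da/dt = (target - a) / tau for each muscle."""
    rate = dt / tau
    return [a + rate * (u - a) for a, u in zip(activity, target_activity)]


def simulate_activity(target_activity, tau, dt, n_substeps):
    """Start from zero activity and hold the target fixed for n_substeps."""
    activity = [0.0] * len(target_activity)
    for _ in range(n_substeps):
        activity = step_activity(activity, target_activity, tau, dt)
    return activity
```

Under this assumption a larger tau gives a smaller step towards the target on every physics step, which matches the description of tau above.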

observations()

Returns muscle activations and forces for every actuator.

Returns: A flat array with the quantities described above.
Return type: np.ndarray

cost()

Approximates the metabolic cost of muscle activations.

Currently, it is given by \(\sum_{i=1}^n \frac{m_{a_i}^2 * f_{max_i}}{n \sum_{i=1}^n f_{max_i}}\), where \(m_{a_i}\) and \(f_{max_i}\) are the activation and the maximum isometric muscle force of muscle \(i\), respectively, and \(n\) is the number of muscles in the model.

Returns: The actuation cost.
Return type: float

reset(): Set activity to zero and recompute muscle quantities.

property muscle_activations

Activity for every muscle.

Returns: An array with copies of the activity for every muscle.
Return type: np.ndarray

property muscle_lengths

Virtual muscle lengths for all muscles.

Returns: An array with copies of the virtual muscle lengths.
Return type: np.ndarray

property muscle_velocities

Virtual muscle speeds for all muscles.

Returns: An array with copies of the virtual muscle velocities.
Return type: np.ndarray

property muscle_forces

Muscle force vectors.

Returns: An array with copies of the forces applied by each muscle.
Return type: np.ndarray

fl(lce)

Force-length curve as implemented by MuJoCo.

Parameters: lce (np.ndarray) – Virtual muscle lengths for MIMo.
Returns: An array with the force-length multipliers.
Return type: np.ndarray

fv(lce_dot)

Force-velocity curve as implemented by MuJoCo.

Parameters: lce_dot (np.ndarray) – Virtual muscle velocities for MIMo.
Returns: An array with the force-velocity multipliers.
Return type: np.ndarray

fp(lce)

Parallel elasticity (passive muscle force) as implemented by MuJoCo.

Parameters: lce (np.ndarray) – Virtual muscle lengths for MIMo.
Returns: An array with the passive force components.
Return type: np.ndarray

set_fmax(fmax)

Setter for fmax.

Parameters: fmax (np.ndarray|float) – The new fmax value(s).

set_vmax(vmax)

Setter for vmax.

Parameters: vmax (np.ndarray|float) – The new vmax value(s).

simulation_torque()

Computes the currently applied torque for each motor in the simulation.

Returns: A numpy array with applied torques for each motor.
Return type: np.ndarray

collect_data_for_actuators()

Collect all muscle related values at the current timestep for all of MIMo’s actuators.

Returns: A list containing the joint position and velocity, corrected position, output torque, desired target muscle activity, actual current muscle activity, virtual muscle length, virtual muscle velocity, muscle force, FL factor, FV factor and the FP component for all muscles.
Return type: List[np.ndarray]

mimoActuation.muscle.bump(length, a, mid, b)

Part of the force length relationship as implemented by MuJoCo.

The parameters a, mid and b define the shape of the force-length curve. See https://arxiv.org/abs/2207.03952 for more details.

Parameters

length (np.ndarray) – The current virtual muscle lengths.
a (float) – One of the parameters of the force-length equation.
mid (float) – One of the parameters of the force-length equation.
b (float) – One of the parameters of the force-length equation.

Returns

Resulting force-length multiplier.

Return type

np.ndarray
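
For orientation, a scalar sketch of such a bump is given below. It follows the piecewise-quadratic form used in MuJoCo's muscle model as far as we are aware: zero outside (a, b), a peak of 1 at mid, and quadratic segments joined halfway between. The documented function operates on arrays and may differ in detail, so the name bump_scalar is used to mark this as an illustration.

```python
def bump_scalar(length, a, mid, b):
    """Piecewise-quadratic bump: 0 outside (a, b), 1 at mid."""
    left = 0.5 * (a + mid)    # switch point on the rising side
    right = 0.5 * (mid + b)   # switch point on the falling side
    if length <= a or length >= b:
        return 0.0
    if length < left:
        x = (length - a) / (left - a)
        return 0.5 * x * x
    if length < mid:
        x = (mid - length) / (mid - left)
        return 1.0 - 0.5 * x * x
    if length < right:
        x = (length - mid) / (right - mid)
        return 1.0 - 0.5 * x * x
    x = (b - length) / (b - right)
    return 0.5 * x * x
```

The curve reaches exactly half of its peak at the two switch points, which makes them convenient values for a quick sanity check.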

mimoActuation.muscle_testing

Functions in this module are designed to help transition actuation models for MIMo.

By default, MIMo uses direct torque motors for actuation with the maximum torques corresponding to the maximum voluntary isometric torques. A second actuation model exists based on https://arxiv.org/abs/2207.03952. This second model more accurately represents the position and velocity dependent force generating behaviour of real muscles. The second model requires several adjustments to the actuation and joint parameters, which can be done using the functions in this module.

mimoActuation.muscle_testing.vectorized(fn): Simple vector wrapper for functions that clearly came from C.

mimoActuation.muscle_testing.fl(vec)

mimoActuation.muscle_testing.bump(length, a, mid, b)

Part of the force length relationship as implemented by MuJoCo.

The parameters a, mid and b define the shape of the force-length curve. See https://arxiv.org/abs/2207.03952 for more details.

Parameters

length (np.ndarray) – The current virtual muscle lengths.
a (float) – One of the parameters of the force-length equation.
mid (float) – One of the parameters of the force-length equation.
b (float) – One of the parameters of the force-length equation.

Returns

Resulting force-length multiplier.

Return type

np.ndarray

mimoActuation.muscle_testing.fp(vec)

mimoActuation.muscle_testing.fv_vec(lce_dot, vmax)

Force-velocity curve.

Parameters

lce_dot (np.ndarray) – Array with virtual muscle velocities.
vmax (np.ndarray|float) – Array or float with the VMAX value.

Returns

The corresponding force-velocity multipliers.

Return type

np.ndarray

mimoActuation.muscle_testing.force_vel_v_vec(velocity, c, vmax, fvmax)

Force velocity relationship as implemented by MuJoCo.

Parameters

velocity (np.ndarray) – Array with virtual muscle velocities.
c (float) – Virtual velocity at which the curve is 1. Determines the shape of the curve.
vmax (np.ndarray|float) – Scaling factor VMAX. Determines the shape of the curve.
fvmax (float) – Maximum multiplier due to velocity. Determines the shape of the curve.

Returns

The corresponding force-velocity multipliers.

Return type

np.ndarray
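
A scalar sketch of this kind of curve is shown below, following the piecewise form of MuJoCo's muscle model as far as we are aware, with negative velocities meaning shortening. Two assumptions apply: the velocity is normalized by vmax, and the saturation point is fixed at fvmax - 1 instead of being passed in as c. The name force_velocity_scalar is hypothetical.

```python
def force_velocity_scalar(velocity, vmax, fvmax):
    """Force-velocity multiplier for a single virtual muscle velocity."""
    v = velocity / vmax      # normalized velocity
    knee = fvmax - 1.0       # normalized velocity at which the curve saturates
    if v <= -1.0:
        return 0.0           # shortening at or beyond vmax: no force
    if v <= 0.0:
        return (v + 1.0) ** 2
    if v <= knee:
        return fvmax - (knee - v) ** 2 / knee
    return fvmax             # lengthening plateau
```

The multiplier is 1 at zero velocity, drops to 0 when shortening at vmax, and saturates at fvmax when lengthening.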

mimoActuation.muscle_testing.vmax_calibration(env_name, n_episodes, save_dir, lr=0.1, lr_decay=0.8, decay_lr_every=100, make_plots=True)

Iteratively calibrate VMAX parameters for the muscle model.

We determine VMAX with an iterative procedure. Using an initial value we take random actions and measure the maximum achieved joint velocity. The initial VMAX values are then updated using learning rate lr and we continue with more random actions. The learning rate is updated every decay_lr_every episodes by factor lr_decay. The procedure continues for n_episodes episodes. Optionally VMAX can be plotted for every step by setting make_plots to True. We use the environment as provided by env_name. For MIMo these are fixed environments in which MIMo is hovering in the air with gravity disabled entirely. Muscle actions do not use the full range of inputs, instead we randomly set maximum or minimum inputs with no in between. The final VMAX values are saved to a file “vmax.npy” in the plotting directory.

Parameters

env_name (str) – The name of the environment to be used for the calibration. Must use the muscle model.
n_episodes (int) – The total number of episodes.
save_dir (str) – The directory where the final VMAX and any plots will be saved.
lr (float) – The learning rate used to update VMAX every episode. Default 0.1.
lr_decay (float) – The learning rate is multiplied by this factor every decay_lr_every episodes. Default 0.8.
decay_lr_every (int) – How often the learning rate is updated. Default 100.
make_plots (bool) – If True we plot the change in VMAX over time and save as a file in the plotting directory. Default True.

Returns

A numpy array with the final VMAX values.

Return type

np.ndarray
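
The iterative procedure can be sketched as a simple loop. The update rule used here, moving each VMAX a fraction lr towards the maximum velocity measured in the episode, is an assumption, since the description only states that the values are updated using lr. The names calibrate_vmax_sketch and toy_episode are hypothetical, and toy_episode replaces an actual simulation episode.

```python
def calibrate_vmax_sketch(measure_max_velocity, vmax_init, n_episodes,
                          lr=0.1, lr_decay=0.8, decay_lr_every=100):
    """Move VMAX towards measured maximum velocities with a decaying rate."""
    vmax = list(vmax_init)
    for episode in range(1, n_episodes + 1):
        measured = measure_max_velocity(vmax)
        vmax = [v + lr * (m - v) for v, m in zip(vmax, measured)]
        if episode % decay_lr_every == 0:
            lr *= lr_decay   # shrink the step size at fixed intervals
    return vmax


def toy_episode(vmax):
    """Stand-in for one episode: pretends fixed maximum velocities were seen."""
    return [3.0, 5.0]
```

In a real calibration the measured velocities depend on the current VMAX values, which is why the procedure is iterated instead of being solved in a single step.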

mimoActuation.muscle_testing.fmax_calibration(env_name, save_dir, n_iterations=3, make_plots=True)

Calibrate FMAX parameters for the muscle model.

The calibration procedure is as follows: We take the desired maximum force values from the actuator definitions in the scene XML. We then apply maximum control input in one direction for 500 steps, back off for 500 steps, and then maximum input in the opposite direction for 500 steps. The maximum torque actually generated during this is recorded for each direction and compared against the desired values. The FMAX parameter is then adjusted such that the generated and desired torques match. This is performed iteratively as all MuJoCo constraints are soft and even locked joints will change position slightly based on applied torque.

This method requires a specialised scene to measure maximum voluntary isometric muscle torque in which all joints are locked in the angle at which the torque is to be measured. Note also that if the initial FMAX (set in MIMoMuscleEnv) is too large the motors may overcome the joint locking entirely, leading to NaNs and associated errors. In this case adjust the initial FMAX downwards.

Parameters

env_name (str) – The name of the environment to be used for the calibration. Must use the muscle model.
save_dir (str) – The directory where the final FMAX and any plots will be saved.
n_iterations (int) – How many iterations of the calibration to perform. Default 3.
make_plots (bool) – If True we plot muscle parameters during the last iteration. Default True.

Returns

A numpy array with the final FMAX values.

Return type

np.ndarray
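
The adjustment loop can be sketched as follows. The multiplicative update, scaling FMAX by the ratio of desired to measured torque, is an assumption consistent with the description above and not the confirmed implementation. The names calibrate_fmax_sketch and toy_measurement are hypothetical; toy_measurement imitates a slightly non-linear response such as the one caused by soft constraints.

```python
def calibrate_fmax_sketch(measure_max_torque, desired_torque, fmax_init,
                          n_iterations=3):
    """Rescale FMAX until the measured torques match the desired torques."""
    fmax = list(fmax_init)
    for _ in range(n_iterations):
        measured = measure_max_torque(fmax)
        fmax = [f * d / m for f, d, m in zip(fmax, desired_torque, measured)]
    return fmax


def toy_measurement(fmax):
    """Stand-in for the simulation: torque grows slightly sub-linearly."""
    return [f * (0.9 - 0.001 * f) for f in fmax]
```

Because the toy response is not exactly linear, a single rescaling leaves a residual error, which further iterations reduce.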

mimoActuation.muscle_testing.create_joint_plots(plot_dir, data, dt=None)

Creates a series of plots for muscle data.

This function is designed to be used with collect_data_for_actuators(). The data argument should be a dictionary with the data for each actuator saved as an array with the actuator name as the dictionary key. The structure of the array should have steps or time as the first dimension and the different return values as the second.

Parameters

plot_dir (str) – The directory where the plots will be saved.
data (Dict[str, np.ndarray]) – A dictionary containing the actuator data.
dt (float|None) – The time between data points. If not None the x-axis will be time instead of number of data points. Default None.

mimoActuation.muscle_testing.average_left_right(env, array)

Averages an array with actuator values between left and right side actuators of MIMo.

Actuators without symmetric versions are left as is.

Parameters

env (mimoEnv.envs.MIMoEnv) – A MIMo environment.
array (np.ndarray) – An array with actuator values. Note that the first dimension must have the same size as the number of MIMo actuators in the environment.

Returns

The averaged array.

Return type

np.ndarray
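
The averaging can be illustrated on plain lists. This sketch assumes that symmetric actuators are paired by names that differ only in "left" versus "right"; that naming scheme, the example names in the usage note and the function name are hypothetical. The documented function takes the environment and looks up the actuators itself.

```python
def average_left_right_sketch(names, values):
    """Replace each left/right pair of values by the pair's mean."""
    index = {name: i for i, name in enumerate(names)}
    averaged = list(values)
    for name, i in index.items():
        if "left" not in name:
            continue
        partner = index.get(name.replace("left", "right"))
        if partner is not None:
            mean = 0.5 * (values[i] + values[partner])
            averaged[i] = averaged[partner] = mean
    return averaged
```

For example, values [2.0, 4.0, 1.0] for actuators named "act:left_knee", "act:right_knee" and "act:head_tilt" become [3.0, 3.0, 1.0].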

mimoActuation.muscle_testing.plotting_episode(env_name, save_dir)

Performs a single episode, saving and creating joint value plots.

We randomize action inputs to either maximum or minimum values every 200 steps.

Parameters

env_name (str) – The name of the environment to use.
save_dir (str) – The directory where the data will be saved.
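
The input pattern described above can be sketched as a schedule that draws a new action of only minimum or maximum values every 200 steps and holds it in between. The function name, the [0, 1] input range (as for the muscle model) and the fixed seed are assumptions made for illustration.

```python
import random


def binary_action_schedule(n_steps, n_inputs, low=0.0, high=1.0, hold=200, seed=0):
    """Return one action per step, redrawn from {low, high} every `hold` steps."""
    rng = random.Random(seed)
    schedule = []
    current = []
    for step in range(n_steps):
        if step % hold == 0:
            current = [rng.choice((low, high)) for _ in range(n_inputs)]
        schedule.append(current)
    return schedule
```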

mimoActuation.muscle_testing.recording_episode(env_name, video_dir, env_params, video_width=500, video_height=500, camera_name=None, make_joint_plots=True, binary_actions=False, interactive=False)

Perform a single episode, saving joint data and creating a video recording.

We randomize action inputs every 200 steps.

Parameters

env_name (str) – The environment to use.
video_dir (str) – The directory where the video and any plots will be saved.
env_params (Dict) – A dictionary with parameters to the environment. Keys are parameter names.
video_width (int) – The width of the rendered video. Default 500.
video_height (int) – The height of the rendered video. Default 500.
camera_name (str|None) – The name of the camera to use for the video. If None, the MuJoCo freecam is used (camera ID -1). Default None.
make_joint_plots (bool) – If True we also save plots of joint and muscle parameters over time. Default True.
binary_actions (bool) – If True, actions are randomized to be minimal or maximal. Default False.
interactive (bool) – If True, an interactive window is also rendered. Default False.

mimoActuation.muscle_testing.compliance_test(): Performs the compliance test from the paper.

mimoActuation.muscle_testing.calibrate_full(save_dir, n_fmax=3, n_vmax=30, n_episodes_per_it=50, n_episodes_video=5, lr_initial=0.1, lr_decay=0.7, fmax_scene='MIMoMuscleStaticTest-v0', vmax_scene='MIMoVelocityMuscleTest-v0', video_scene=None)

Determine muscle parameters for a given model.

Performs FMAX and VMAX calibrations. Afterward some scenes can be recorded to video with the new muscle parameters. Note that the parameter calibration requires specialised scenes, see the documentation for fmax_calibration() and vmax_calibration() for more information.

Parameters

save_dir (str) – The directory where output files and subdirectories will be created.
n_fmax (int) – The number of iterations for the FMAX calibration. Default 3.
n_vmax (int) – The number of iterations for the VMAX calibration. Default 30.
n_episodes_per_it (int) – The number of episodes for each VMAX iteration. Default 50.
n_episodes_video (int) – After calibration, this many episodes will be recorded to video using the new parameters.
lr_initial (float) – The initial learning rate for the VMAX iteration. Default 0.1.
lr_decay (float) – Decay factor after each VMAX iteration. Default 0.7.
fmax_scene (str) – The environment, by name, to use for the FMAX calibration.
vmax_scene (str) – The environment, by name, to use for the VMAX calibration.
video_scene (str) – The environment, by name, that will be used to record videos. If None, no video is recorded. Default None.

Returns

The new FMAX and VMAX parameters.

Return type

Tuple[np.ndarray, np.ndarray]

mimoActuation.muscle_testing.repeatability_test(save_dir, n_fmax=3, n_vmax=30, n_episodes_per_it=50, n_episodes_video=5, lr_initial=0.1, lr_decay=0.7, fmax_scene='MIMoMuscleStaticTest-v0', vmax_scene='MIMoVelocityMuscleTest-v0', video_scene=None, n_repeats=3)

Performs multiple full calibrations and compares the results against one another for repeatability.

Parameters

save_dir (str) – The directory where output files and subdirectories will be created.
n_fmax (int) – The number of iterations for the FMAX calibration. Default 3.
n_vmax (int) – The number of iterations for the VMAX calibration. Default 30.
n_episodes_per_it (int) – The number of episodes for each VMAX iteration. Default 50.
n_episodes_video (int) – After calibration, this many episodes will be recorded to video using the new parameters.
lr_initial (float) – The initial learning rate for the VMAX iteration. Default 0.1.
lr_decay (float) – Decay factor after each VMAX iteration. Default 0.7.
fmax_scene (str) – The environment, by name, to use for the FMAX calibration.
vmax_scene (str) – The environment, by name, to use for the VMAX calibration.
video_scene (str) – The environment, by name, that will be used to record videos. If None, no video is recorded. Default None.
n_repeats (int) – The number of repetitions.

mimoActuation.muscle_testing.make_flfvfp_plots(): Creates a set of plots to show the FL, FV and FP curves.