MIMoEnv base class 

This module defines the base MIMo environment.

The abstract base class is MIMoEnv. Default parameters for all the sensory modalities are provided as well.

Contents

MIMoEnv base class
- MIMoEnv
- Default data fields

MIMoEnv 

class mimoEnv.envs.mimo_env.MIMoEnv(model_path, initial_qpos=None, frame_skip=2, render_mode=None, camera_id=None, camera_name=None, width=DEFAULT_SIZE, height=DEFAULT_SIZE, default_camera_config=None, proprio_params=None, touch_params=None, vision_params=None, vestibular_params=None, actuation_model=SpringDamperModel, goals_in_observation=True, done_active=False)

Bases: Generic[gymnasium.core.ObsType, gymnasium.core.ActType]

This is the abstract base class for all MIMo experiments.

This class meets the interface requirements for basic gym classes and adds some additional features. The observation space is of dictionary type.

Sensory modules are configured by a parameter dictionary. Default configuration dictionaries are included in the same module as this class: DEFAULT_PROPRIOCEPTION_PARAMS, DEFAULT_TOUCH_PARAMS, DEFAULT_VISION_PARAMS and DEFAULT_VESTIBULAR_PARAMS. Passing one of these to the constructor enables the relevant sensory module. Not passing a dictionary disables the relevant module. By default, all sensory modalities are disabled and the only sensor outputs are the relative joint positions. Actuation models can also be changed using the actuation_model constructor argument. They do not use a configuration dictionary, instead deriving all required parameters from the XMLs.

Implementing subclasses will have to override the following functions:

is_success(), to determine when an episode reaches a success terminal state.
is_failure(), to determine when an episode reaches a failure terminal state.
is_truncated(), to determine when an episode ends for other reasons, such as a time limit or out-of-bounds condition.
compute_reward(), to compute the reward at each step.
reset_model(), which resets the physical simulation. If you wish to randomize some aspect of the scene, this function is the place to implement that.
sample_goal(), which should determine the desired end state.
get_achieved_goal(), which should return the achieved end state.

Depending on the requirements of your experiment any of these functions may be implemented as dummy functions returning fixed values. Additional functions that may be overridden optionally are:

_is_done(), which determines the ‘terminal’ and ‘truncated’ return values after each step.
_proprio_setup(), _touch_setup(), _vision_setup(), _vestibular_setup(), these functions initialize the associated sensor modality. These should be overridden if you want to replace the default implementation. Default implementations are SimpleProprioception, DiscreteTouch, SimpleVision, SimpleVestibular.
get_proprio_obs(), get_touch_obs(), get_vision_obs(), get_vestibular_obs(), these functions collect the observations of the associated sensor modality. These allow you to do post-processing on the output without having to alter the base implementations.
_step_callback() and _substep_callback(), which are called after every environment and simulation step respectively.

These functions come with default implementations that should handle most scenarios.
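As a rough illustration of the required overrides, the sketch below implements the goal-related hooks for a hypothetical one-dimensional reaching task. To stay runnable without MuJoCo it is written as a plain class; in a real experiment these methods would live on a subclass of MIMoEnv, and reset_model() would additionally reset the physics state. The class name ReachHooks, the tolerance, the step limit and the attributes position and steps are all assumptions made for this example.

```python
import random


class ReachHooks:
    """Hypothetical task hooks; in practice these would be methods of a MIMoEnv subclass."""

    TOLERANCE = 0.05   # assumed success radius
    MAX_STEPS = 500    # assumed time limit

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.position = 0.0   # stand-in for a quantity read from the simulation
        self.steps = 0        # would be incremented in _step_callback()

    def sample_goal(self):
        # The desired end state: a target position in [-1, 1].
        return self.rng.uniform(-1.0, 1.0)

    def get_achieved_goal(self):
        # The achieved end state: the current position.
        return self.position

    def is_success(self, achieved_goal, desired_goal):
        return abs(achieved_goal - desired_goal) < self.TOLERANCE

    def is_failure(self, achieved_goal, desired_goal):
        # Dummy implementation returning a fixed value: no failure state here.
        return False

    def is_truncated(self):
        return self.steps >= self.MAX_STEPS

    def compute_reward(self, achieved_goal, desired_goal, info):
        # Dense reward: negative distance to the goal.
        return -abs(achieved_goal - desired_goal)


hooks = ReachHooks()
goal = hooks.sample_goal()
hooks.position = goal  # pretend the agent reached the target
print(hooks.is_success(hooks.get_achieved_goal(), goal))
```

Hooks that an experiment does not need, such as is_failure() here, can simply return a fixed value.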

Parameters

model_path (str) – The path to the scene xml.
initial_qpos (Dict[str, float]|None) – A dictionary of the initial joint positions. Keys are the joint names, with joint positions in radians as values. None by default.
frame_skip (int) – The number of physics substeps for each simulation step. The duration of each physics step is set in the scene XML. Default 2.
render_mode (str|None) – The render mode for gymnasium functions. We support “human”, “rgb_array” and “depth_array”. In mode “human”, the environment can be viewed with an interactive viewer. In modes “rgb_array” and “depth_array”, color images and depth images are rendered and returned. Please see the gymnasium documentation for more details.
camera_id (int) – The camera, by ID, which will be used for rendering.
camera_name (str) – The camera, by name, which will be used for rendering.
width (int) – The width of the rendered image.
height (int) – The height of the rendered image.
proprio_params (Dict|None) – The configuration dictionary for the proprioceptive system. If None the module is disabled. Default None.
touch_params (Dict|None) – The configuration dictionary for the touch system. If None the module is disabled. Default None.
vision_params (Dict|None) – The configuration dictionary for the vision system. If None the module is disabled. Default None.
vestibular_params (Dict|None) – The configuration dictionary for the vestibular system. If None the module is disabled. Default None.
actuation_model (Type[ActuationModel]) – Class for the actuation model. Default is SpringDamperModel. Note that this must be a class, not an instance.
goals_in_observation (bool) – If True the desired and achieved goals are included in the observation dictionary. Default True.
done_active (bool) – If True, _is_done() returns True if the simulation reaches a success or failure state. If False, _is_done() always returns False and the function calling step() has to figure out when to stop or reset the simulation on its own.

model

The MuJoCo model object.

Type: MjModel

data

The MuJoCo data object.

Type: MjData

init_qpos

The initial position vector for the entire scene. Can be used with set_state() to return the simulation to its initial state.

Type: np.ndarray

init_qvel

The initial velocity vector for the entire scene. Can be used with set_state() to return the simulation to its initial state.

Type: np.ndarray

frame_skip: The number of simulation substeps for each environment step.

goal

The desired goal.

Type: object

action_space

The action space. See Gym documentation for more.

Type: gym.spaces.Space

observation_space

The observation space. See Gym documentation for more.

Type: gym.spaces.Space

actuation_model

Reference to the actuation model instance.

Type: ActuationModel

proprio_params

The configuration dictionary for the proprioceptive system.

Type: Dict

touch_params

The configuration dictionary for the touch system.

Type: Dict

vision_params

The configuration dictionary for the vision system.

Type: Dict

vestibular_params

The configuration dictionary for the vestibular system.

Type: Dict

proprioception

A reference to the proprioception instance.

Type: Proprioception

touch

A reference to the touch instance.

Type: Touch

vision

A reference to the vision instance.

Type: Vision

vestibular

A reference to the vestibular instance.

Type: Vestibular

facial_expressions

A dictionary linking emotions with their associated facial textures. The keys of this dictionary are valid inputs for swap_facial_expression().

Type: Dict[str, int]

goals_in_observation

If True the desired and achieved goals are included in the observation dictionary. Default True.

Type: bool

done_active

If True, _is_done() returns True if the simulation reaches a success or failure state. If False, _is_done() always returns False and the function calling step() has to figure out when to stop or reset the simulation on its own.

Type: bool

camera_id

The camera, by ID, which will be used to render images.

Type: int

camera_name

The camera, by name, which will be used to render images.

Type: str

render_mode

The render mode for basic calls to render().

Type: str

_initialize_simulation(): Initialize MuJoCo simulation data structures mjModel and mjData.

property n_actuators

The number of actuators for MIMo.

Returns: The number of actuators for MIMo.
Return type: int

_get_actuators(): Saves IDs of the actuators associated with MIMo in mimo_actuators.

_get_joints(): Saves the IDs of the joints associated with MIMO in mimo_joints.

_set_action_space()

Sets the action space attribute.

By default, the action space contains only MIMo's actuators.

_set_observation_space()

Sets the observation space attribute.

Calls _get_obs() and determines the space using the returned observations.

_get_facial_expressions(emotion_textures)

Associates facial textures in the model with human-readable names for the associated emotions.

Parameters: emotion_textures (Dict[str, str]) – A dictionary with names for emotions as keys and the XML names of the associated facial textures as values.

_env_setup()

This function initializes all the sensory components of the model.

Calls the setup functions for all the sensory components.

_set_initial_position(initial_qpos)

Sets the initial positions for joints in the environment.

The input should be a dictionary with joint names as keys and joint positions (in radians, as floats) as values. This function then sets each listed joint to the corresponding position. Joints not contained in the dictionary are left unaltered.

Parameters: initial_qpos (dict[str, float]) – A dictionary with joint names as keys and joint positions (in radians as floats) as values.
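The behaviour described for _set_initial_position() can be sketched with plain Python containers. This is not the actual implementation, which operates on the MuJoCo model and data; the helper name, the joint names and the flat qpos layout below are assumptions for illustration only.

```python
def set_initial_position(qpos, joint_index, initial_qpos):
    """Return a copy of qpos with the listed joints set to the given angles (radians).

    qpos         -- list of joint positions (stand-in for the MuJoCo qpos vector)
    joint_index  -- hypothetical mapping from joint name to its index in qpos
    initial_qpos -- dictionary of joint name -> desired position in radians
    """
    new_qpos = list(qpos)
    for joint_name, angle in initial_qpos.items():
        new_qpos[joint_index[joint_name]] = angle
    # Joints not listed in initial_qpos keep their previous value.
    return new_qpos


# Hypothetical joint names and qpos layout.
joint_index = {"left_elbow": 0, "right_elbow": 1, "head_tilt": 2}
qpos = [0.0, 0.0, 0.1]
print(set_initial_position(qpos, joint_index, {"left_elbow": -0.5}))
```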

proprio_setup(proprio_params)

Perform the setup and initialization of the proprioceptive system.

This should be overridden if you want to use another implementation!

Parameters: proprio_params (dict) – The parameter dictionary.

touch_setup(touch_params)

Perform the setup and initialization of the touch system.

This should be overridden if you want to use another implementation!

Parameters: touch_params (dict) – The parameter dictionary.

vision_setup(vision_params)

Perform the setup and initialization of the vision system.

This should be overridden if you want to use another implementation!

Parameters: vision_params (dict) – The parameter dictionary.

vestibular_setup(vestibular_params)

Perform the setup and initialization of the vestibular system.

This should be overridden if you want to use another implementation!

Parameters: vestibular_params (dict) – The parameter dictionary.

_single_mujoco_step()

_set_action(action)

Set the action for the next step.

Calls the actuation model's function mimoActuation.actuation.ActuationModel.action(). What exactly happens depends on the specific implementation.

Parameters: action (numpy.ndarray) – A numpy array with control values.

do_simulation(action, n_frames)

Step simulation forward for n_frames number of steps.

Parameters

action (np.ndarray) – The control input for the actuators.
n_frames (int) – The number of physics steps to perform.
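The relationship between n_frames, the physics timestep and the substep callback can be sketched as follows. The loop is a stand-in for the real physics stepping, the timestep value of 0.005 s is an assumed example (the actual value comes from the scene XML), and calling the callback inside this loop is an assumption based on the description of _substep_callback().

```python
def do_simulation_sketch(n_frames, physics_timestep, substep_callback=None):
    """Advance a fake clock by n_frames physics steps and return the elapsed time."""
    elapsed = 0.0
    for _ in range(n_frames):
        elapsed += physics_timestep      # stand-in for one MuJoCo physics step
        if substep_callback is not None:
            substep_callback()           # runs once per physics substep
    return elapsed


substeps = []
# With frame_skip=2 (the default) and an assumed 0.005 s timestep,
# one environment step covers 2 * 0.005 s of simulated time.
duration = do_simulation_sketch(2, 0.005, lambda: substeps.append(1))
print(duration, len(substeps))
```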

step(action)

Run one timestep of the environment’s dynamics.

This function takes a simulation step with the given control inputs, collects the observations, computes the reward and finally determines if we are done with this episode or not. _get_obs() collects the observations, compute_reward() calculates the reward. _is_done() is called to determine if we have reached a terminal state and _step_callback() can be used for extra functions each step, such as incrementing a step counter. Both the ‘terminated’ and ‘truncated’ return values are determined by _is_done().

Parameters

action (np.ndarray) – An action provided by the agent

Returns

observation (object): This will be an element of the environment’s observation_space. This may, for instance, be a numpy array containing the positions and velocities of certain objects.
reward (float): The amount of reward returned as a result of taking the action.
terminated (bool): Whether a terminal state (success or failure as defined under the MDP of the task) is reached. In this case further step() calls could return undefined results.
truncated (bool): Whether a truncation condition outside the scope of the MDP is satisfied. Typically a time limit, but could also be used to indicate the agent physically going out of bounds. Can be used to end the episode prematurely before a terminal state is reached.
info (dict): Auxiliary diagnostic information (helpful for debugging, learning, and logging). This might, for instance, contain metrics that describe the agent’s performance state, variables that are hidden from observations, or individual reward terms that are combined to produce the total reward.

Return type

Tuple[object, float, bool, bool, dict]
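The order of operations described for step() can be made concrete with a stand-in class that only records which hook runs when. It performs no physics and is not the actual MIMoEnv code; the class name StepOrderDemo and the fixed return values are assumptions, while the hook names follow this page.

```python
class StepOrderDemo:
    """Stand-in that records the order of the hooks involved in step()."""

    def __init__(self, frame_skip=2):
        self.frame_skip = frame_skip
        self.calls = []

    def do_simulation(self, action, n_frames):
        self.calls.append(f"do_simulation(n_frames={n_frames})")

    def _step_callback(self):
        self.calls.append("_step_callback")

    def _get_obs(self):
        self.calls.append("_get_obs")
        return {"observation": [0.0], "achieved_goal": 0.0, "desired_goal": 1.0}

    def _obs_callback(self):
        self.calls.append("_obs_callback")

    def compute_reward(self, achieved_goal, desired_goal, info):
        self.calls.append("compute_reward")
        return -abs(achieved_goal - desired_goal)

    def _is_done(self, achieved_goal, desired_goal, info):
        self.calls.append("_is_done")
        return False, False

    def step(self, action):
        self.do_simulation(action, self.frame_skip)
        self._step_callback()   # sensors have not been updated yet
        obs = self._get_obs()
        self._obs_callback()    # observations are now up to date
        info = {}
        reward = self.compute_reward(obs["achieved_goal"], obs["desired_goal"], info)
        terminated, truncated = self._is_done(obs["achieved_goal"], obs["desired_goal"], info)
        return obs, reward, terminated, truncated, info


env = StepOrderDemo()
obs, reward, terminated, truncated, info = env.step([0.0])
print(env.calls)
```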

_step_callback()

A custom callback that is called after stepping the simulation, but before collecting observations.

Useful to enforce additional constraints on the simulation state before observations are collected. Note that the sensory modalities do not update until _get_obs() is called, so they will not have updated to the current timestep.

_substep_callback(): A custom callback that is called after each simulation substep.

_obs_callback()

A custom callback that is called after collecting the observations.

Like _step_callback, but with up-to-date observations.

_reset_simulation(): Resets MuJoCo and actuation simulation data and samples a new goal.

get_proprio_obs()

Collects and returns the outputs of the proprioceptive system.

Override this function if you want to do some simple post-processing!

Returns: A numpy array containing the proprioceptive output.
Return type: numpy.ndarray

get_touch_obs()

Collects and returns the outputs of the touch system.

Override this function if you want to do some simple post-processing!

Returns: A numpy array containing the touch output.
Return type: numpy.ndarray

get_vision_obs()

Collects and returns the outputs of the vision system.

Override this function if you want to do some simple post-processing!

Returns: A dictionary with one entry for each separate image. In the default implementation each eye renders one image, so each eye gets one entry.
Return type: dict[str, np.ndarray]

get_vestibular_obs()

Collects and returns the outputs of the vestibular system.

Override this function if you want to do some simple post-processing!

Returns: A numpy array with the vestibular data.
Return type: numpy.ndarray
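The post-processing pattern suggested for the get_*_obs() functions might look like the sketch below. BaseTouchStub is a hypothetical stand-in for an environment with touch enabled, and it returns a plain list rather than a numpy array so the example runs without dependencies; the threshold is likewise an assumption.

```python
class BaseTouchStub:
    """Hypothetical stand-in for an environment with touch enabled."""

    def get_touch_obs(self):
        # Fixed fake sensor readings instead of real touch output.
        return [0.0, 0.3, 1.7, 0.0]


class BinaryTouch(BaseTouchStub):
    """Post-processes the touch output without altering the base implementation."""

    THRESHOLD = 0.1  # assumed contact threshold

    def get_touch_obs(self):
        raw = super().get_touch_obs()
        # Reduce each reading to contact / no contact.
        return [1.0 if abs(value) > self.THRESHOLD else 0.0 for value in raw]


print(BinaryTouch().get_touch_obs())
```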

_get_obs()

Returns the observation.

This function should return all simulation outputs relevant to whatever learning algorithm you wish to use. We always return proprioceptive information in the ‘observation’ entry, and this information always includes relative joint positions. Other sensory modalities get their own entries, if they are enabled. If goals_in_observation is set to True, the achieved and desired goal are also included.

Returns: A dictionary containing simulation outputs with separate entries for each sensor modality.
Return type: Dict
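The structure of the returned dictionary can be sketched as a free function. The ‘observation’, ‘achieved_goal’ and ‘desired_goal’ keys appear on this page; the ‘touch’ and ‘vestibular’ key names, the per-image vision keys and the function name assemble_obs are assumptions made for this example.

```python
def assemble_obs(proprio, touch=None, vision=None, vestibular=None,
                 goals_in_observation=True, achieved_goal=None, desired_goal=None):
    """Build an observation dictionary from the outputs of the enabled modalities."""
    obs = {"observation": proprio}      # proprioception is always present
    if touch is not None:
        obs["touch"] = touch            # assumed key name
    if vision is not None:
        obs.update(vision)              # assumed: one entry per rendered image
    if vestibular is not None:
        obs["vestibular"] = vestibular  # assumed key name
    if goals_in_observation:
        obs["achieved_goal"] = achieved_goal
        obs["desired_goal"] = desired_goal
    return obs


# Only proprioception and vision enabled; "eye_left" is a hypothetical image name.
obs = assemble_obs([0.1, 0.2], vision={"eye_left": [[0, 0], [0, 0]]},
                   achieved_goal=0.0, desired_goal=1.0)
print(sorted(obs))
```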

swap_facial_expression(emotion)

Changes MIMo's facial texture.

Valid emotion names are in facial_expressions, which links readable emotion names to their associated texture ids.

Parameters: emotion (str) – A valid emotion name.

_is_done(achieved_goal, desired_goal, info)

This function should determine if we reached the end of an episode. Dummy implementation.

By default, this function always returns False. If done_active is set to True, it instead returns True if either is_success() or is_failure() returns True. The goal parameters are there to allow this function to be more easily overridden by subclasses, should this be required. They are ignored by default.

Parameters

achieved_goal (object) – The goal that was achieved during execution.
desired_goal (object) – The desired goal that we asked the agent to attempt to achieve.
info (dict) – An info dictionary with additional information.

Returns

terminated (bool): Whether the current episode reached a success or failure state.
truncated (bool): Whether the current episode entered some kind of invalid condition or “finished” due to some other constraint, such as a time limit.

Return type

Tuple[bool, bool]
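The effect of done_active on these return values can be sketched as a free function. The terminated logic follows the description above; how truncation is reported when done_active is False is not spelled out on this page, so returning False in that case is an assumption, as is the function name.

```python
def is_done_sketch(done_active, success, failure, truncation_condition):
    """Return (terminated, truncated) following the described _is_done() behaviour."""
    if not done_active:
        # The caller of step() has to decide when to stop or reset.
        return False, False
    terminated = success or failure
    # Assumption: truncation is only reported while done_active is set.
    truncated = truncation_condition
    return terminated, truncated


print(is_done_sketch(done_active=False, success=True, failure=False, truncation_condition=True))
print(is_done_sketch(done_active=True, success=True, failure=False, truncation_condition=False))
```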

action_space: spaces.Space[ActType]

observation_space: spaces.Space[ObsType]

is_success(achieved_goal, desired_goal)

Indicates if the achieved goal matches the desired goal.

Parameters

achieved_goal (object) – The goal that was achieved during execution.
desired_goal (object) – The desired goal that we asked the agent to attempt to achieve.

Returns

If we successfully reached the desired goal state.

Return type

bool

is_failure(achieved_goal, desired_goal)

Indicates that we reached a failure state.

Parameters

achieved_goal (object) – The goal that was achieved during execution.
desired_goal (object) – The desired goal that we asked the agent to attempt to achieve.

Returns

If we reached an unrecoverable failure state.

Return type

bool

is_truncated()

Indicates that we reached an ending condition other than a success or failure state, such as a time limit.

Returns: If we reached some ending condition other than a terminal state.
Return type: bool

reset_model()

This function should reset the simulation state and return observations for the post-reset state.

Returns: The observations after reset.
Return type: Dict

sample_goal()

Should sample a new goal and return it.

Returns: The desired end state.
Return type: object

get_achieved_goal()

Should return the goal that was achieved during the simulation.

Returns: The achieved end state.
Return type: object

compute_reward(achieved_goal, desired_goal, info)

Compute the step reward.

This externalizes the reward function and makes it dependent on a desired goal and the one that was achieved. If you wish to include additional rewards that are independent of the goal, you can include the necessary values to derive it in info and compute it accordingly.

Parameters

achieved_goal (object) – the goal that was achieved during execution
desired_goal (object) – the desired goal that we asked the agent to attempt to achieve
info (dict) – an info dictionary with additional information

Returns

The reward that corresponds to the provided achieved goal with respect to the desired goal. Note that the following should always hold true:

ob, reward, terminated, truncated, info = env.step(action)

assert reward == env.compute_reward(ob['achieved_goal'], ob['desired_goal'], info)

Return type

float
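A goal-dependent reward with an additional goal-independent term passed through info might be sketched as follows. The scalar goals, the ‘effort’ key and its weight are hypothetical choices for this example and are not part of MIMo.

```python
def compute_reward_sketch(achieved_goal, desired_goal, info):
    """Negative goal distance plus a small penalty derived from the info dictionary."""
    goal_term = -abs(achieved_goal - desired_goal)
    # Goal-independent term: the values needed to derive it travel in info.
    effort_term = -0.01 * info.get("effort", 0.0)
    return goal_term + effort_term


info = {"effort": 10.0}
reward = compute_reward_sketch(0.5, 1.0, info)
# Recomputing from the same inputs gives the same reward, which is the
# consistency property stated for step() and compute_reward().
assert reward == compute_reward_sketch(0.5, 1.0, info)
print(reward)
```

Because the function depends only on its arguments, rewards can be recomputed later for a different desired goal, for example when relabelling stored transitions.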

Default data fields 

mimoEnv.envs.mimo_env.SCENE_DIRECTORY

Path to the scene directory.

mimoEnv.envs.mimo_env.EMOTES

Valid facial expressions.

mimoEnv.envs.mimo_env.DEFAULT_PROPRIOCEPTION_PARAMS

Default parameters for proprioception. Relative joint positions are always included.

mimoEnv.envs.mimo_env.DEFAULT_TOUCH_PARAMS

Default touch parameters.

mimoEnv.envs.mimo_env.DEFAULT_TOUCH_PARAMS_V2

Default touch parameters for the v2 version of MIMo with five fingers and two toes.

mimoEnv.envs.mimo_env.DEFAULT_VISION_PARAMS

Default vision parameters.

mimoEnv.envs.mimo_env.DEFAULT_VESTIBULAR_PARAMS

Default vestibular parameters.
