Vision system

mimoVision.vision.Vision

Abstract base class for vision.

mimoVision.vision.SimpleVision

A simple vision system with one camera for each output.

mimoVision.vision

This module defines the vision interface and provides a simple implementation.

The interface is defined as an abstract class in Vision. A simple implementation treating each eye as a single camera is in SimpleVision.

class mimoVision.vision.Vision(env, camera_parameters)

Bases: object

Abstract base class for vision.

This class defines the functions that all implementing classes must provide. The constructor takes two arguments: env, the environment we are working with, and camera_parameters, which can be used to supply implementation-specific parameters.

There is only one function that implementations must provide: get_vision_obs() should produce the vision outputs that will be returned to the environment. These outputs should also be stored in sensor_outputs.

env

The environment to which this module will be attached

Type

MujocoEnv

camera_parameters

A dictionary containing the configuration. The exact form will depend on the specific implementation.

sensor_outputs

A dictionary containing the outputs produced by the sensors. The shape will depend on the specific implementation. This should be populated by get_vision_obs().

get_vision_obs()

Produces the current vision output.

This function should perform the whole sensory pipeline and return the vision output as defined in camera_parameters. The exact return value and functionality depend on the implementation, but the return value should always be a dictionary with images as values.

Returns

A dictionary of numpy arrays with the output images.

Return type

Dict[str, np.ndarray]
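As an illustration of the interface described above, the following sketch shows what a minimal implementation might look like. To stay self-contained it defines a stand-in base class, VisionBase, that only mirrors the documented attributes and method instead of importing mimoVision.vision.Vision. ConstantVision, the optional 'value' key and the (height, width, 3) uint8 image layout are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from typing import Dict

import numpy as np


class VisionBase(ABC):
    """Stand-in mirroring the documented Vision interface (not the real class)."""

    def __init__(self, env, camera_parameters):
        self.env = env
        self.camera_parameters = camera_parameters
        self.sensor_outputs = {}

    @abstractmethod
    def get_vision_obs(self) -> Dict[str, np.ndarray]:
        """Run the sensory pipeline and return a dictionary of images."""


class ConstantVision(VisionBase):
    """Hypothetical implementation that returns uniformly gray images."""

    def get_vision_obs(self) -> Dict[str, np.ndarray]:
        images = {}
        for name, params in self.camera_parameters.items():
            # Assumed layout: rows first, then columns, then RGB channels.
            shape = (params["height"], params["width"], 3)
            images[name] = np.full(shape, params.get("value", 128), dtype=np.uint8)
        # Store the outputs as the interface asks, then return them.
        self.sensor_outputs = images
        return images


# No environment is needed for this dummy implementation, so env is None.
vision = ConstantVision(env=None, camera_parameters={"eye_left": {"width": 4, "height": 2}})
obs = vision.get_vision_obs()
```

The point of the sketch is the contract: the method returns a dictionary of images and leaves the same dictionary in sensor_outputs.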

class mimoVision.vision.SimpleVision(env, camera_parameters)

Bases: mimoVision.vision.Vision

A simple vision system with one camera for each output.

The output is simply one RGB image for each camera in the configuration. The constructor takes two arguments: env, which is the environment we are working with, and camera_parameters, which provides the configuration for the vision system. The parameter camera_parameters should be a dictionary with the following structure:

{
    'camera_name': {'width': width, 'height': height},
    'other_camera_name': {'width': width, 'height': height},
}

The default MIMo model has two cameras, one in each eye, named eye_left and eye_right. Every camera named in the dictionary must exist in the scene XML, otherwise errors will occur.
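For the default model, a configuration might look like the sketch below. The 64x64 resolution is an arbitrary value chosen for illustration, and check_camera_parameters is a hypothetical helper, not part of mimoVision, that catches malformed entries before they reach the renderer. It cannot verify that the cameras exist in the scene XML.

```python
camera_parameters = {
    "eye_left": {"width": 64, "height": 64},
    "eye_right": {"width": 64, "height": 64},
}


def check_camera_parameters(params):
    """Hypothetical helper: raise ValueError unless every camera has a positive integer width and height."""
    for name, cfg in params.items():
        for key in ("width", "height"):
            value = cfg.get(key)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(
                    f"camera {name!r}: {key!r} must be a positive integer, got {value!r}"
                )
    return params


# Returns the dictionary unchanged when every entry is well formed.
check_camera_parameters(camera_parameters)
```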

env

The environment to which this module should be attached

camera_parameters

A dictionary containing the configuration.

sensor_outputs

A dictionary containing the outputs produced by the sensors. This is populated by get_vision_obs().

get_vision_obs()

Produces the current vision output.

This function renders each camera at the resolution defined in camera_parameters using an off-screen render context. The images are also stored in sensor_outputs under the name of the associated camera.

Returns

A dictionary with camera names as keys and the corresponding rendered images as values.

Return type

Dict[str, np.ndarray]

save_obs_to_file(directory, suffix='')

Saves the output images to file.

Every time this function is called, all images in sensor_outputs are saved to separate files in directory. The filename is determined by the camera name and suffix. Saving large images takes a long time!

Parameters
  • directory (str) – The output directory. It will be created if it does not already exist.

  • suffix (str) – Optional file suffix. Useful for a step counter. Empty by default.
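
The exact naming scheme is not spelled out here, so the following is only a sketch of how a step counter could be passed as the suffix and how per-camera file names might be derived. build_output_paths, the .png extension and the "camera name followed by suffix" pattern are assumptions for illustration and may not match what save_obs_to_file actually writes.

```python
import os
import tempfile


def build_output_paths(directory, camera_names, suffix=""):
    """Hypothetical helper: create `directory` if needed and return one file path per camera."""
    os.makedirs(directory, exist_ok=True)
    return {
        name: os.path.join(directory, f"{name}{suffix}.png")
        for name in camera_names
    }


# Use a temporary location so the example leaves the working directory untouched.
out_dir = os.path.join(tempfile.mkdtemp(), "vision_frames")
step = 12
# A zero-padded step counter keeps the files in chronological order when sorted by name.
paths = build_output_paths(out_dir, ["eye_left", "eye_right"], suffix=f"_{step:05d}")
```

With the real class, the corresponding call would be save_obs_to_file(directory, suffix), made after get_vision_obs() has populated sensor_outputs.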