Vision system
- Vision: Abstract base class for vision.
- SimpleVision: A simple vision system with one camera for each output.
mimoVision.vision
This module defines the vision interface and provides a simple implementation.
The interface is defined as an abstract class in Vision.
A simple implementation treating each eye as a single camera is in SimpleVision.
- class mimoVision.vision.Vision(env, camera_parameters)
Bases: object
Abstract base class for vision.
This class defines the functions that all implementing classes must provide. The constructor takes two arguments: env, the environment we are working with, and camera_parameters, which can be used to supply implementation-specific parameters.
Implementations must provide only one function: get_vision_obs() should produce the vision outputs that will be returned to the environment. These outputs should also be stored in sensor_outputs.
- env
The environment to which this module will be attached
- Type
MujocoEnv
- camera_parameters
A dictionary containing the configuration. The exact form will depend on the specific implementation.
- sensor_outputs
A dictionary containing the outputs produced by the sensors. Shape will depend on the specific implementation. This should be populated by get_vision_obs().
- get_vision_obs()
Produces the current vision output.
This function should perform the whole sensory pipeline and return the vision output as defined in camera_parameters. The exact return value and functionality will depend on the implementation, but the return value should always be a dictionary containing images as values.
- Returns
A dictionary of numpy arrays with the output images.
- Return type
Dict[str, np.ndarray]
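As a rough illustration of the interface described above, the following sketch defines a stand-in base class and a trivial implementation. VisionSketch and BlankVision are hypothetical names that only mirror the documented attributes; they are not part of mimoVision. The width/height configuration layout is borrowed from SimpleVision as an assumption, and env is left as None because no simulation is needed here.

```python
from abc import ABC, abstractmethod
from typing import Dict

import numpy as np


class VisionSketch(ABC):
    """Hypothetical stand-in mirroring the documented Vision interface."""

    def __init__(self, env, camera_parameters):
        self.env = env
        self.camera_parameters = camera_parameters
        self.sensor_outputs = {}

    @abstractmethod
    def get_vision_obs(self) -> Dict[str, np.ndarray]:
        """Run the sensory pipeline and return a dictionary of images."""


class BlankVision(VisionSketch):
    """Hypothetical implementation that returns black RGB images."""

    def get_vision_obs(self) -> Dict[str, np.ndarray]:
        imgs = {}
        for name, params in self.camera_parameters.items():
            # Image arrays are laid out as (height, width, channels).
            imgs[name] = np.zeros(
                (params["height"], params["width"], 3), dtype=np.uint8
            )
        # Store the outputs as well as returning them, as the interface asks.
        self.sensor_outputs = imgs
        return imgs


vision = BlankVision(
    env=None,
    camera_parameters={"eye_left": {"width": 64, "height": 32}},
)
obs = vision.get_vision_obs()
```

A real implementation would render from the environment instead of returning zeros, but the contract is the same: return a dictionary of images and keep a copy in sensor_outputs.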
- class mimoVision.vision.SimpleVision(env, camera_parameters)
Bases: mimoVision.vision.Vision
A simple vision system with one camera for each output.
The output is simply one RGB image for each camera in the configuration. The constructor takes two arguments: env, which is the environment we are working with, and camera_parameters, which provides the configuration for the vision system. The parameter camera_parameters should be a dictionary with the following structure:
{
    'camera_name': {'width': width, 'height': height},
    'other_camera_name': {'width': width, 'height': height},
}
The default MIMo model has two cameras, one in each eye, named eye_left and eye_right. Note that every camera named in the dictionary must exist in the scene XML, or errors will occur!
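For the default model, a configuration following the structure above might look like the sketch below. The helper validate_camera_parameters is hypothetical (it is not part of mimoVision) and the 256x256 resolution is an arbitrary example value. The helper only checks the dictionary layout; it cannot check that the cameras exist in the scene XML.

```python
def validate_camera_parameters(camera_parameters):
    """Hypothetical helper: check the documented name -> {width, height} layout."""
    for name, params in camera_parameters.items():
        for key in ("width", "height"):
            value = params.get(key)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(
                    f"Camera '{name}' needs a positive integer '{key}', got {value!r}"
                )
    return camera_parameters


# One entry per camera; the names match the two eye cameras of the default model.
camera_parameters = validate_camera_parameters({
    "eye_left": {"width": 256, "height": 256},
    "eye_right": {"width": 256, "height": 256},
})
```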
- env
The environment to which this module should be attached
- camera_parameters
A dictionary containing the configuration.
- sensor_outputs
A dictionary containing the outputs produced by the sensors. This is populated by get_vision_obs().
- get_vision_obs()
Produces the current vision output.
This function renders each camera at the resolution defined in camera_parameters using an off-screen render context. The images are also stored in sensor_outputs under the name of the associated camera.
- Returns
A dictionary with camera names as keys and the corresponding rendered images as values.
- Return type
Dict[str, np.ndarray]
- save_obs_to_file(directory, suffix='')
Saves the output images to file.
Every time this function is called, all images in sensor_outputs are saved to separate files in directory. The filename is determined by the camera name and suffix. Saving large images takes a long time!
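The sketch below shows one way the per-camera filenames could be derived from the camera name and suffix. The helper output_paths is hypothetical, and the pattern "<camera_name><suffix>.png" is an assumption; the actual naming and image format used by save_obs_to_file may differ.

```python
import os


def output_paths(camera_names, directory, suffix=""):
    """Hypothetical helper: one file path per camera, built from name and suffix."""
    # Assumption: files are named "<camera_name><suffix>.png".
    return {
        name: os.path.join(directory, f"{name}{suffix}.png")
        for name in camera_names
    }


paths = output_paths(["eye_left", "eye_right"], "imgs", suffix="_step0")
```

A suffix that changes between calls, such as a step counter, keeps later calls from overwriting earlier files under this naming assumption. Since saving large images is slow, it may be preferable to save only every few steps.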