Creating your own experiment 

Contents

Creating your own experiment
- The scene XML
- The environment class

In this guide we will create a new experiment using the standup environment as an example. This involves creating a new scene XML and environment class. Creating a new scene will require working with MuJoCo XMLs.

In this scenario MIMo stands up from a low crouch or sitting position. The scene contains MIMo and a crib-like structure, with MIMo's feet welded to the ground and his hands welded to the crib. Episodes have a fixed length, with a reward at each step based on the height of MIMo's head. We use only the proprioceptive and vestibular sensors.

The first step will be to create the XML for this scene.

The scene XML 

MuJoCo allows for a modular structure by importing other XMLs. We make use of this by having two component XMLs containing the required elements for MIMo, which are imported by the scene XMLs. The scene XMLs are then loaded by the code.

The component XMLs are “MIMo_model.xml”, which contains the kinematic tree, and “MIMo_meta.xml”, which contains the definitions of the actuators, MuJoCo sensors, textures and so forth. These have to be split due to the XML importing process. Both are located in mimoEnv/assets/mimo/.

We start with a stripped-down sample XML:

<mujoco model="MIMo">

    <compiler inertiafromgeom="true" angle="degree"/>
    <option timestep="0.005" iterations="50" tolerance="1e-10" solver="Newton" jacobian="dense" cone="elliptic" impratio="1.0"/>
    <size nconmax="1000" njmax="5000" nstack="10000000" nuser_cam="3"/>

    <visual>
        <map force="0.1" zfar="30" znear="0.005"/>
        <rgba haze="0.15 0.25 0.35 1"/>
        <quality shadowsize="4096"/>
        <global offwidth="800" offheight="800"/>
    </visual>

    <asset>
        <texture type="skybox" builtin="gradient" rgb1="0.3 0.5 0.7" rgb2="0 0 0" width="512" height="512"/>
        <texture name="texplane" type="2d" builtin="checker" rgb1=".2 .3 .4" rgb2=".1 0.15 0.2" width="512" height="512" mark="cross" markrgb=".8 .8 .8"/>
        <texture name="texgeom" type="cube" builtin="flat" mark="cross" width="127" height="1278" rgb1="0.8 0.6 0.4" rgb2="0.8 0.6 0.4" markrgb="1 1 1" random="0.01"/>
        <material name="matplane" reflectance="0.3" texture="texplane" texrepeat="1 1" texuniform="true"/>
        <material name="matgeom" texture="texgeom" texuniform="true" rgba="0.8 0.6 .4 1"/>

        <texture name="crib" type="cube" builtin="flat" width="127" height="1278" rgb1="1 0.9 0.8" rgb2="1 1 1" markrgb="1 1 1"/>
        <material name="crib" texture="crib" texuniform="true"/>
    </asset>

    <worldbody>
        <geom name="floor" pos="0 0 0" size="0 0 .25" type="plane" material="matplane" condim="3"/>
        <light directional="false" diffuse=".4 .4 .4" specular="0 0 0" pos="0 0 10" dir="0 0 -1" castshadow="false"/>
    </worldbody>
</mujoco>

The worldbody element will contain the kinematic tree for the whole scene, which currently consists of just an infinite floor plane and a light. The other elements define various parameters of the scene, such as the maximum number of contacts and the default textures. See the MuJoCo documentation for details.

To include MIMo we import the two component XMLs. “MIMo_model.xml” is included in the worldbody element and “MIMo_meta.xml” just above:

<!-- Import everything except the kinematic tree -->
<include file="mimo/MIMo_meta.xml"></include>

<worldbody>
    <geom name="floor" pos="0 0 0" size="0 0 .25" type="plane" material="matplane" condim="3"/>
    <light directional="false" diffuse=".4 .4 .4" specular="0 0 0" pos="0 0 10" dir="0 0 -1" castshadow="false"/>
    <light mode="targetbodycom" target="upper_body" directional="false" diffuse=".8 .8 .8" specular="0.3 0.3 0.3" pos="0 0 5.0" dir="0 0 -1"/>

    <!-- The location and orientation of the base model can be set using this body -->
    <body name="mimo_location" pos="0 0 .33" euler="0 0 0">
        <freejoint/>
        <include file="mimo/MIMo_model.xml"></include> <!-- Import the actual model-->
    </body>
</worldbody>

We also add an extra light that tracks MIMo's torso.
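The include structure described above can be inspected without MuJoCo. The following is a minimal sketch using only Python's standard library; the helper name `list_includes` and the trimmed sample scene are illustrative assumptions and not part of MIMo.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for a scene XML; only the include structure matters here.
SAMPLE_SCENE = """
<mujoco model="MIMo">
    <include file="mimo/MIMo_meta.xml"></include>
    <worldbody>
        <body name="mimo_location" pos="0 0 .33">
            <freejoint/>
            <include file="mimo/MIMo_model.xml"></include>
        </body>
    </worldbody>
</mujoco>
"""


def list_includes(xml_text):
    """Return (parent tag, file) for every <include> element, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "include":
                found.append((parent.tag, child.get("file")))
    return found


includes = list_includes(SAMPLE_SCENE)
for parent_tag, file_name in includes:
    print(f"{file_name} is included inside <{parent_tag}>")
```

A check like this makes it easy to confirm that the meta file is included at the top level while the model file sits inside the body that positions MIMo.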

MIMo's hands and feet can be fixed in position by adding equality constraints:

<equality>
    <weld body1="left_foot"  relpose="-0 -0.05 0 0.01 0 0 0"/>
    <weld body1="right_foot" relpose="-0 0.05 0 0.01 0 0 0"/>
    <weld body1="left_fingers"  relpose="0.1 0.1 0.45 0 -0.1 0.1 0 "/>
    <weld body1="right_fingers" relpose="0.1 -0.1 0.45 0 0.1 0.1 0 "/>
    <weld body1="head" body2="upper_body"/>
    <weld body1="left_eye" body2="head"/>
    <weld body1="right_eye" body2="head"/>
</equality>
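To make the weld attributes easier to read, here is a small sketch that summarises a few of the constraints above. It relies on two points from the MuJoCo XML reference: a weld without body2 attaches body1 to the world body, and relpose holds a 3D position followed by a 4D quaternion (w, x, y, z). The helper `describe_welds` is hypothetical, and the quaternion is normalised here only for readability.

```python
import math
import xml.etree.ElementTree as ET

# A subset of the welds from the scene, copied verbatim.
EQUALITY_XML = """
<equality>
    <weld body1="left_foot"  relpose="-0 -0.05 0 0.01 0 0 0"/>
    <weld body1="left_fingers"  relpose="0.1 0.1 0.45 0 -0.1 0.1 0 "/>
    <weld body1="head" body2="upper_body"/>
</equality>
"""


def describe_welds(xml_text):
    """Return (body1, body2, position, unit quaternion) for each weld."""
    summaries = []
    for weld in ET.fromstring(xml_text).iter("weld"):
        # An omitted body2 means the weld targets the world body.
        body2 = weld.get("body2", "world")
        relpose = weld.get("relpose")
        if relpose is None:
            # Without relpose the pose from the reference configuration is kept.
            summaries.append((weld.get("body1"), body2, None, None))
            continue
        values = [float(v) for v in relpose.split()]
        pos, quat = values[:3], values[3:]
        norm = math.sqrt(sum(q * q for q in quat))
        summaries.append((weld.get("body1"), body2, pos, [q / norm for q in quat]))
    return summaries


welds = describe_welds(EQUALITY_XML)
for body1, body2, pos, quat in welds:
    print(body1, "->", body2, pos, quat)
```

Read this way, the quaternion (0.01, 0, 0, 0) used for the feet points in the same direction as the identity rotation, while the head weld carries no relpose at all.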

Finally we add the crib to the scene:

<body name="crib" pos="0.078 0 0.42">
    <geom type="cylinder" material="crib" size="0.02 0.4" pos="0 0 0" euler="90 0 0"/>
    <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 0 -0.2" euler="0 0 0"/>
    <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 -0.2 -0.2" euler="0 0 0"/>
    <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 0.2 -0.2" euler="0 0 0"/>
    <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 -0.4 -0.2" euler="0 0 0"/>
    <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 0.4 -0.2" euler="0 0 0"/>
    <geom type="sphere"   material="crib" size="0.022" pos="0 -0.4 0" euler="0 0 0"/>
    <geom type="sphere"   material="crib" size="0.022" pos="0 0.4 0" euler="0 0 0"/>
</body>
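The crib dimensions follow from MuJoCo's cylinder convention, in which size gives the radius and the half-height along the geom's local z axis. As a rough sketch, assuming the crib body and the upright bars are not rotated (as in the XML above), the world-frame extent of the bars can be worked out as follows; the function name `bar_extent_z` is made up for illustration.

```python
# Values copied from the crib XML above.
CRIB_POS = (0.078, 0.0, 0.42)
BAR_HALF_HEIGHT = 0.2                       # second entry of size="0.01 0.2"
BAR_OFFSETS_Y = (-0.4, -0.2, 0.0, 0.2, 0.4)
BAR_OFFSET_Z = -0.2


def bar_extent_z(crib_z, offset_z, half_height):
    """Return the world-frame (bottom, top) height of an upright cylinder."""
    centre = crib_z + offset_z
    return centre - half_height, centre + half_height


bottom, top = bar_extent_z(CRIB_POS[2], BAR_OFFSET_Z, BAR_HALF_HEIGHT)
print(f"bars span z = {bottom:.2f} to {top:.2f}")
for y in BAR_OFFSETS_Y:
    print(f"bar at world x = {CRIB_POS[0]:.3f}, y = {CRIB_POS[1] + y:+.1f}")
```

The bars therefore reach from just above the floor up to the top rail, which is the first cylinder in the body and is rotated by 90 degrees about the x axis so that it lies along y.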

There is still some trimming we can do. Since we do not use vision in this scenario, we fixed MIMo's eyes and head above. However, their actuators are still included in the scene and take up resources. To remove them, we replace “MIMo_meta.xml” in our scene with a copy from which those actuators have been removed, called “standup_meta.xml”.
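The guide does not say how the trimmed copy was produced, and editing it by hand works fine. If you prefer to script it, the following sketch shows one possible approach with the standard library. The actuator and joint names in the sample, as well as the keyword filter, are hypothetical and need to be adapted to the real names in “MIMo_meta.xml”.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for a meta XML. Actuator and joint names are hypothetical.
SAMPLE_META = """
<mujoco>
    <actuator>
        <motor name="act:hip_bend" joint="robot:hip_bend"/>
        <motor name="act:head_tilt" joint="robot:head_tilt"/>
        <motor name="act:left_eye_horizontal" joint="robot:left_eye_horizontal"/>
    </actuator>
</mujoco>
"""

# Substrings marking actuators to drop (assumed naming scheme).
UNUSED_KEYWORDS = ("head", "eye")


def strip_actuators(xml_text, keywords):
    """Return XML text without actuators whose name contains any keyword."""
    root = ET.fromstring(xml_text)
    for actuator_block in root.iter("actuator"):
        # Copy the child list first: removing while iterating skips elements.
        for motor in list(actuator_block):
            name = motor.get("name", "")
            if any(keyword in name for keyword in keywords):
                actuator_block.remove(motor)
    return ET.tostring(root, encoding="unicode")


trimmed = strip_actuators(SAMPLE_META, UNUSED_KEYWORDS)
remaining = [m.get("name") for m in ET.fromstring(trimmed).iter("motor")]
print(remaining)
```

Whichever way the copy is made, the action space of the environment shrinks accordingly, since it is derived from the actuators present in the scene.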

This leaves us with our finished scene XML:

<mujoco model="MIMo">

    <compiler inertiafromgeom="true" angle="degree"/>
    <option timestep="0.005" iterations="50" tolerance="1e-10" solver="Newton" jacobian="dense" cone="elliptic" impratio="1.0"/>
    <size nconmax="1000" njmax="5000" nstack="10000000" nuser_cam="3"/>

    <visual>
        <map force="0.1" zfar="30" znear="0.005"/>
        <rgba haze="0.15 0.25 0.35 1"/>
        <quality shadowsize="4096"/>
        <global offwidth="800" offheight="800"/>
    </visual>

    <asset>
        <texture type="skybox" builtin="gradient" rgb1="0.3 0.5 0.7" rgb2="0 0 0" width="512" height="512"/>
        <texture name="texplane" type="2d" builtin="checker" rgb1=".2 .3 .4" rgb2=".1 0.15 0.2" width="512" height="512" mark="cross" markrgb=".8 .8 .8"/>
        <texture name="texgeom" type="cube" builtin="flat" mark="cross" width="127" height="1278" rgb1="0.8 0.6 0.4" rgb2="0.8 0.6 0.4" markrgb="1 1 1" random="0.01"/>
        <material name="matplane" reflectance="0.3" texture="texplane" texrepeat="1 1" texuniform="true"/>
        <material name="matgeom" texture="texgeom" texuniform="true" rgba="0.8 0.6 .4 1"/>

        <texture name="crib" type="cube" builtin="flat" width="127" height="1278" rgb1="1 0.9 0.8" rgb2="1 1 1" markrgb="1 1 1"/>
        <material name="crib" texture="crib" texuniform="true"/>
    </asset>

    <!-- Import everything except the kinematic tree -->
    <include file="standup_meta.xml"></include>

    <equality>
        <weld body1="left_foot"  relpose="-0 -0.05 0 0.01 0 0 0"/>
        <weld body1="right_foot" relpose="-0 0.05 0 0.01 0 0 0"/>
        <weld body1="left_fingers"  relpose="0.1 0.1 0.45 0 -0.1 0.1 0 "/>
        <weld body1="right_fingers" relpose="0.1 -0.1 0.45 0 0.1 0.1 0 "/>
        <weld body1="head" body2="upper_body"/>
        <weld body1="left_eye" body2="head"/>
        <weld body1="right_eye" body2="head"/>
    </equality>

    <worldbody>
        <geom name="floor" pos="0 0 0" size="0 0 .25" type="plane" material="matplane" condim="3"/>
        <light directional="false" diffuse=".4 .4 .4" specular="0 0 0" pos="0 0 10" dir="0 0 -1" castshadow="false"/>
        <light mode="targetbodycom" target="upper_body" directional="false" diffuse=".8 .8 .8" specular="0.3 0.3 0.3" pos="0 0 5.0" dir="0 0 -1"/>

        <!-- The location and orientation of the base model can be set using this body -->
        <body name="mimo_location" pos="0 0 .33" euler="0 0 0">
            <freejoint/>
            <include file="mimo/MIMo_model.xml"></include> <!-- Import the actual model-->
        </body>

        <body name="crib" pos="0.078 0 0.42">
            <geom type="cylinder" material="crib" size="0.02 0.4" pos="0 0 0" euler="90 0 0"/>
            <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 0 -0.2" euler="0 0 0"/>
            <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 -0.2 -0.2" euler="0 0 0"/>
            <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 0.2 -0.2" euler="0 0 0"/>
            <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 -0.4 -0.2" euler="0 0 0"/>
            <geom type="cylinder" material="crib" size="0.01 0.2" pos="0 0.4 -0.2" euler="0 0 0"/>
            <geom type="sphere"   material="crib" size="0.022" pos="0 -0.4 0" euler="0 0 0"/>
            <geom type="sphere"   material="crib" size="0.022" pos="0 0.4 0" euler="0 0 0"/>
        </body>
    </worldbody>
</mujoco>

The environment class 

We start by subclassing MIMoEnv and adjusting the default parameters for our experiment. The model path points to our scene XML. We don’t need touch or vision, so we disable them by passing None. The proprioceptive and vestibular sensors use their default parameters. Since we want fixed-length episodes, we set done_active to False. The parameters are simply passed through to the parent class.

class MIMoStandupEnv(MIMoEnv):
    def __init__(self,
                 model_path=STANDUP_XML,
                 proprio_params=DEFAULT_PROPRIOCEPTION_PARAMS,
                 touch_params=None,
                 vision_params=None,
                 vestibular_params=DEFAULT_VESTIBULAR_PARAMS,
                 done_active=False,
                 **kwargs,
                 ):

        super().__init__(model_path=model_path,
                         proprio_params=proprio_params,
                         touch_params=touch_params,
                         vision_params=vision_params,
                         vestibular_params=vestibular_params,
                         done_active=done_active,
                         **kwargs)

Next we need to override all the abstract functions. We will use the head height as our goal variable:

def get_achieved_goal(self):
    return self.data.body('head').xpos[2]

Since we want fixed-length episodes and have disabled done_active, we don’t need any of the other goal-related functions and implement them as dummy functions:

def is_success(self, achieved_goal, desired_goal):
    return False

def is_failure(self, achieved_goal, desired_goal):
    return False

def is_truncated(self):
    return False

def sample_goal(self):
    return 0.0

The only things still missing are the reward and the reset functions. The reward will consist of a positive component based on the head height, determined in get_achieved_goal, and a penalty for large actions:

def compute_reward(self, achieved_goal, desired_goal, info):
    quad_ctrl_cost = 0.01 * np.square(self.data.ctrl).sum()
    reward = achieved_goal - 0.2 - quad_ctrl_cost
    return reward
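To get a feel for the scale of the two reward terms, the formula can be evaluated outside the simulation. This sketch reimplements it in plain Python; the function name `standup_reward`, the head height of 0.5, and the ten actuators are made-up illustration values, not measurements from MIMo.

```python
def standup_reward(head_height, controls):
    """Head height minus a 0.2 offset and a quadratic control penalty."""
    quad_ctrl_cost = 0.01 * sum(c * c for c in controls)
    return head_height - 0.2 - quad_ctrl_cost


# Hypothetical values: head at height 0.5 and ten actuators.
relaxed = standup_reward(0.5, [0.0] * 10)    # no control effort
straining = standup_reward(0.5, [0.5] * 10)  # every actuator driven at 0.5
print(relaxed, straining)
```

With these numbers the reward is roughly 0.3 without control effort and roughly 0.275 with it, so the penalty stays small compared to the height term unless the actions become large.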

Finally we need to be able to reset the simulation. We reset all positions to the state from the XML and then slightly randomize the joint positions, which are stored in the qpos array. The first seven entries belong to the free joint between MIMo and the world, so we exclude them from the randomization. The crib has no joints, so all remaining joints in the scene belong to MIMo. We then set the state with the randomized positions and let the simulation settle for 100 timesteps:

def reset_model(self):
    self.set_state(self.init_qpos, self.init_qvel)
    # init_crouch_position is assumed to be stored in __init__ (not shown above).
    # Copy it so the noise below does not accumulate in the stored pose.
    qpos = self.init_crouch_position.copy()

    # set initial positions stochastically, skipping the free joint (first 7 entries)
    qpos[7:] = qpos[7:] + self.np_random.uniform(low=-0.01, high=0.01, size=len(qpos[7:]))

    # set initial velocities to zero
    qvel = np.zeros(self.data.qvel.ravel().shape)

    self.set_state(qpos, qvel)

    # perform 100 steps with no actions to stabilize initial position
    actions = np.zeros(self.action_space.shape)
    self._set_action(actions)
    mujoco.mj_step(self.model, self.data, nstep=100)

    return self._get_obs()
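The qpos handling in the reset can be illustrated without MuJoCo. The sketch below uses plain Python lists and a made-up ten-entry pose; `randomized_qpos` and `FREE_JOINT_DOFS` are hypothetical names. It works on a copy of the initial pose, because perturbing a stored pose in place would let the noise accumulate from one reset to the next.

```python
import random

FREE_JOINT_DOFS = 7  # 3 position + 4 quaternion entries at the start of qpos


def randomized_qpos(initial_qpos, rng, noise=0.01):
    """Return a perturbed copy of initial_qpos, leaving the free joint untouched."""
    qpos = list(initial_qpos)  # copy, so the stored initial pose is not modified
    for i in range(FREE_JOINT_DOFS, len(qpos)):
        qpos[i] += rng.uniform(-noise, noise)
    return qpos


# Hypothetical pose: free joint at height 0.33 with identity rotation, three joints.
initial = [0.0, 0.0, 0.33, 1.0, 0.0, 0.0, 0.0, 0.1, -0.2, 0.3]
rng = random.Random(0)
sample = randomized_qpos(initial, rng)
print(sample[:FREE_JOINT_DOFS] == initial[:FREE_JOINT_DOFS])
```

Calling the function repeatedly always starts from the same unperturbed pose, which keeps the distribution of start states identical across episodes.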

Finally we register our new environment with gym by adding these lines to mimoEnv/__init__.py, which also lets us set our fixed episode length:

register(id='MIMoStandup-v0',
         entry_point='mimoEnv.envs:MIMoStandupEnv',
         max_episode_steps=500,
         )

We can then create our new environment with:

import gym
import mimoEnv

env = gym.make('MIMoStandup-v0')
