garage.envs.multi_env_wrapper

A wrapper env that handles multiple tasks from different envs.

Useful while training multi-task reinforcement learning algorithms. It provides observations augmented with a one-hot representation of the active task.

round_robin_strategy(num_tasks, last_task=None)

A function for sampling tasks in round robin fashion.

Parameters
  • num_tasks (int) – Total number of tasks.

  • last_task (int) – Previously sampled task.

Returns

task id.

Return type

int
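For intuition, the round-robin rule is just an increment modulo the number of tasks. A minimal sketch with equivalent behavior (not necessarily the library source):

    def round_robin_strategy(num_tasks, last_task=None):
        """Return the next task id, cycling 0, 1, ..., num_tasks - 1."""
        if last_task is None:
            return 0  # nothing sampled yet; start with task 0
        return (last_task + 1) % num_tasks

    # round_robin_strategy(3) -> 0; then 0 -> 1, 1 -> 2, 2 -> 0, ...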

uniform_random_strategy(num_tasks, _)

A function for sampling tasks uniformly at random.

Parameters
  • num_tasks (int) – Total number of tasks.

  • _ (object) – Ignored by this sampling strategy.

Returns

task id.

Return type

int
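A matching sketch for the uniform strategy; the unused second parameter exists only so both strategies share the signature function(num_tasks, last_task):

    import random

    def uniform_random_strategy(num_tasks, _):
        """Return a task id drawn uniformly from 0 .. num_tasks - 1."""
        return random.randint(0, num_tasks - 1)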

class MultiEnvWrapper(envs, sample_strategy=uniform_random_strategy, mode='add-onehot', env_names=None)

Bases: garage.Wrapper


A wrapper class to handle multiple environments.

This wrapper adds an integer ‘task_id’ to env_info every timestep.

Parameters
  • envs (list(Environment)) – A list of objects implementing Environment.

  • sample_strategy (function(int, int)) – Sample strategy to be used when sampling a new task.

  • mode (str) – A string from ‘vanilla’, ‘add-onehot’ and ‘del-onehot’; the type of observation to use.

    • ‘vanilla’ provides the observation as-is. Use case: metaworld environments with MT* algorithms, gym environments with Task Embedding.

    • ‘add-onehot’ appends a one-hot task id to the observation. Use case: gym environments with MT* algorithms.

    • ‘del-onehot’ assumes a one-hot task id is appended to the observation, and removes it. Use case: metaworld environments with Task Embedding.

  • env_names (list(str)) – The names of the environments corresponding to envs. The index of an env_name must correspond to the index of the corresponding env in envs. An env_name in env_names must be unique.
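A hedged construction sketch. GymEnv and CartPole-v1 are stand-ins for any list of garage Environments; with mode='add-onehot', each observation is extended by a one-hot task id of length len(envs):

    from garage.envs import GymEnv
    from garage.envs.multi_env_wrapper import (MultiEnvWrapper,
                                               round_robin_strategy)

    envs = [GymEnv('CartPole-v1'), GymEnv('CartPole-v1')]
    mtenv = MultiEnvWrapper(envs,
                            sample_strategy=round_robin_strategy,
                            mode='add-onehot',
                            env_names=['cartpole-a', 'cartpole-b'])  # names must be unique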

property observation_space

Observation space.

Returns

Observation space.

Return type

akro.Box
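Continuing the sketch above: in ‘add-onehot’ mode the flattened observation space gains one dimension per task (an illustration of the documented behavior, assuming a flat base observation):

    base_dim = envs[0].observation_space.flat_dim   # 4 for CartPole-v1
    assert mtenv.observation_space.flat_dim == base_dim + len(envs)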

property spec

Describes the action and observation spaces of the wrapped envs.

Returns

the action and observation spaces of the wrapped environments.

Return type

EnvSpec

property num_tasks

Total number of tasks.

Returns

number of tasks.

Return type

int

property task_space

Task space.

Returns

Task space.

Return type

akro.Box

property active_task_index

Index of active task env.

Returns

Index of active task.

Return type

int

property action_space

The action space specification.

Type

akro.Space

property render_modes

A list of strings representing the supported render modes.

Type

list

property unwrapped

The inner environment.

Type

garage.Environment

reset()

Sample a new task and call reset() on the new task's env.

Returns

numpy.ndarray: The first observation conforming to observation_space.

dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information of the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).

Return type

tuple(numpy.ndarray, dict)

step(action)

Step the active task env.

Parameters

action (object) – object to be passed in Environment.step(action)

Returns

The environment step resulting from the action.

Return type

EnvStep
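Putting reset() and step() together, a hedged rollout sketch continuing the example above; note the ‘task_id’ key the wrapper adds to env_info at every step:

    obs, episode_info = mtenv.reset()        # samples a task, resets its env
    for _ in range(200):
        es = mtenv.step(mtenv.action_space.sample())
        task_id = es.env_info['task_id']     # integer index of the active task
        if es.last:                          # episode ended (terminal or timeout)
            obs, _ = mtenv.reset()           # strategy picks the next task
        else:
            obs = es.observation
    mtenv.close()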

close()

Close all task envs.

render(mode)

Render the wrapped environment.

Parameters

mode (str) – the mode to render with. The string must be present in self.render_modes.

Returns

the return value from render, depending on each env.

Return type

object

visualize()

Create a visualization of the wrapped environment.