garage.envs.multi_env_wrapper
A wrapper env that handles multiple tasks from different envs.
Useful when training multi-task reinforcement learning algorithms. It provides observations augmented with a one-hot representation of the active task.
-
round_robin_strategy(num_tasks, last_task=None)
A function for sampling tasks in round-robin fashion.
-
uniform_random_strategy(num_tasks, _)
A function for sampling tasks uniformly at random.
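The two sampling strategies take the number of tasks and the previously active task index and return the next task index. The bodies below are a minimal sketch of the documented behavior, not necessarily the library source:

```python
import random


def round_robin_strategy(num_tasks, last_task=None):
    """Cycle through task indices 0..num_tasks-1 in order."""
    if last_task is None:
        return 0  # start from the first task
    return (last_task + 1) % num_tasks


def uniform_random_strategy(num_tasks, _):
    """Pick a task index uniformly at random; the last task is ignored."""
    return random.randint(0, num_tasks - 1)
```

Either function (or any callable with the same `(num_tasks, last_task)` signature) can be passed as `sample_strategy` to `MultiEnvWrapper`.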
-
-
class MultiEnvWrapper(envs, sample_strategy=uniform_random_strategy, mode='add-onehot', env_names=None)
Bases: garage.Wrapper
A wrapper class to handle multiple environments.
This wrapper adds an integer 'task_id' to env_info every timestep.
- Parameters
envs (list(Environment)) – A list of objects implementing Environment.
sample_strategy (function(int, int)) – Sample strategy to be used when sampling a new task.
mode (str) – A string from 'vanilla', 'add-onehot' and 'del-onehot'. The type of observation to use.
- 'vanilla' provides the observation as-is. Use case: metaworld environments with MT* algorithms, gym environments with Task Embedding.
- 'add-onehot' appends a one-hot task id to the observation. Use case: gym environments with MT* algorithms.
- 'del-onehot' assumes a one-hot task id is appended to the observation, and removes it. Use case: metaworld environments with Task Embedding.
env_names (list(str)) – The names of the environments corresponding to envs. The index of each env_name must match the index of the corresponding env in envs. Each env_name must be unique.
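The 'add-onehot' and 'del-onehot' modes amount to appending or stripping a one-hot task vector on a flat observation. A minimal sketch of the transformation (the helper names here are illustrative, not part of the garage API):

```python
import numpy as np


def add_onehot(obs, task_id, num_tasks):
    # Append a one-hot encoding of task_id to the flat observation.
    onehot = np.zeros(num_tasks)
    onehot[task_id] = 1.0
    return np.concatenate([obs, onehot])


def del_onehot(obs, num_tasks):
    # Drop the trailing one-hot block, recovering the raw observation.
    return obs[:-num_tasks]
```

For example, with a raw observation `[0.5, -0.2]` and task 1 of 3, 'add-onehot' yields `[0.5, -0.2, 0.0, 1.0, 0.0]`, and 'del-onehot' reverses it.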
-
property observation_space(self)
Observation space.
- Returns
Observation space.
- Return type
akro.Box
-
property spec(self)
Describes the action and observation spaces of the wrapped envs.
- Returns
The action and observation spaces of the wrapped environments.
- Return type
garage.EnvSpec
-
property task_space(self)
Task space.
- Returns
Task space.
- Return type
akro.Box
-
property active_task_index(self)
Index of active task env.
- Returns
Index of active task.
- Return type
int
-
reset(self)
Sample a new task and call reset on the new task's env.
- Returns
numpy.ndarray: The first observation conforming to observation_space.
dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned tasks or MTRL).
- Return type
tuple(numpy.ndarray, dict)
-
step(self, action)
Step the active task env.
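The reset/step contract can be seen in a self-contained sketch: a toy wrapper that samples a task on reset, appends the one-hot task id to the observation (as in 'add-onehot' mode), and records 'task_id' in env_info on every step. This is an illustration of the documented behavior over gym-style envs, not the garage implementation:

```python
import numpy as np


class ToyMultiEnvWrapper:
    """Minimal stand-in: each 'env' must provide reset() and step(action)."""

    def __init__(self, envs, sample_strategy):
        self._envs = envs
        self._sample_strategy = sample_strategy
        self._active = None  # index of the currently active task env

    def reset(self):
        # Sample a new task, then reset that task's env.
        self._active = self._sample_strategy(len(self._envs), self._active)
        obs = self._envs[self._active].reset()
        onehot = np.eye(len(self._envs))[self._active]
        episode_info = {'task_id': self._active}
        return np.concatenate([obs, onehot]), episode_info

    def step(self, action):
        # Step the active task env and tag the timestep with its task id.
        obs, reward, done, env_info = self._envs[self._active].step(action)
        onehot = np.eye(len(self._envs))[self._active]
        env_info['task_id'] = self._active
        return np.concatenate([obs, onehot]), reward, done, env_info
```

With a round-robin strategy, successive resets cycle through the task envs, and every observation carries the active task's one-hot suffix.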
-
close(self)
Close all task envs.
-
property action_space(self)
akro.Space: The action space specification.
-
property render_modes(self)
list: A list of strings representing the supported render modes.
-
render(self, mode)
Render the wrapped environment.
-
visualize(self)
Creates a visualization of the wrapped environment.
-
property unwrapped(self)
garage.Environment: The inner environment.