garage.envs¶
Garage wrappers for gym environments.
-
class
GridWorldEnv(desc='4x4', max_episode_length=None)¶ Bases:
garage.Environment
A simple 2D grid environment.
The grid is described with the following characters:
‘S’: starting point
‘F’ or ‘.’: free space
‘W’ or ‘x’: wall
‘H’ or ‘o’: hole (terminates episode)
‘G’: goal
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
observation_space(self)¶ akro.Space: The observation space specification.
-
property
spec(self)¶ EnvSpec: The environment specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
reset(self)¶ Resets the environment.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
-
step(self, action)¶ Steps the environment.
Action map: 0: left, 1: down, 2: right, 3: up.
- Parameters
action (int) – an int encoding the action
- Returns
The environment step resulting from the action.
- Return type
- Raises
RuntimeError – if step() is called after the environment has been constructed and reset() has not been called.
NotImplementedError – if the next step in self._desc does not match a known state type.
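The action map and cell characters documented for GridWorldEnv can be illustrated with a minimal plain-Python sketch. This is not garage's implementation: the map `DESC`, the helper `move`, the (row, column) convention with row 0 at the top, and the rule that walls and grid edges leave the agent in place are all assumptions made for this example.

```python
# Hypothetical 4x4 map using the documented cell characters.
DESC = [
    "SFFF",
    "FWFH",
    "FFFH",
    "HFFG",
]

# Documented action map: 0: left, 1: down, 2: right, 3: up.
# Offsets are (row, col); row 0 is assumed to be the top row.
ACTION_DELTAS = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def move(desc, pos, action):
    """Return (new_pos, cell) after applying action at pos.

    Assumption: a move that would leave the grid or enter a wall
    ('W' or 'x') keeps the agent where it is.
    """
    rows, cols = len(desc), len(desc[0])
    d_row, d_col = ACTION_DELTAS[action]
    row, col = pos[0] + d_row, pos[1] + d_col
    if not (0 <= row < rows and 0 <= col < cols) or desc[row][col] in "Wx":
        row, col = pos
    return (row, col), desc[row][col]


pos = (0, 0)  # the 'S' cell of DESC
for action in (2, 2, 1, 1, 1, 2):  # right, right, down, down, down, right
    pos, cell = move(DESC, pos, action)
```

After the loop, `pos` is the bottom-right cell and `cell` is 'G'. A real environment would additionally end the episode on 'H'/'o' and 'G' cells.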
-
render(self, mode)¶ Renders the environment.
- Parameters
mode (str) – the mode to render with. The string must be present in Environment.render_modes.
-
visualize(self)¶ Creates a visualization of the environment.
-
close(self)¶ Close the env.
-
class
GymEnv(env, is_image=False, max_episode_length=None)¶ Bases:
garage.Environment
Returns an abstract Garage wrapper class for gym.Env.
In order to provide pickling (serialization) and parameterization for gym.Env instances, they must be wrapped with GymEnv. This ensures compatibility with existing samplers and checkpointing when the envs are passed around internally in garage.
Furthermore, classes inheriting from GymEnv should silently convert action_space and observation_space from gym.Space to akro.Space.
GymEnv handles all environments created by make().
It returns a different wrapper class instance if the input environment requires special handling. Currently supported wrapper classes are:
garage.envs.bullet.BulletEnv for Bullet-based gym environments.
See __new__() for details.
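The idea of a constructor that hands back a more specialised wrapper can be sketched with Python's `__new__` hook. The sketch below only illustrates the pattern and is not garage's code: `SketchEnvWrapper`, `SketchBulletWrapper`, and `FakeBulletBackend` are hypothetical names, and the `isinstance` check is an assumed stand-in for however special environments are actually detected.

```python
class FakeBulletBackend:
    """Hypothetical stand-in for an env that needs special handling."""


class SketchEnvWrapper:
    """Hypothetical base wrapper (not garage's GymEnv)."""

    def __new__(cls, env):
        # Dispatch only when the base class itself is being constructed.
        if cls is SketchEnvWrapper and isinstance(env, FakeBulletBackend):
            # Allocate the specialised subclass instead. Python then calls
            # that subclass's __init__ with the same arguments.
            return super().__new__(SketchBulletWrapper)
        return super().__new__(cls)

    def __init__(self, env):
        self.env = env


class SketchBulletWrapper(SketchEnvWrapper):
    """Hypothetical specialised wrapper."""


plain = SketchEnvWrapper(object())
special = SketchEnvWrapper(FakeBulletBackend())
```

Here `plain` is a `SketchEnvWrapper`, while `special` is a `SketchBulletWrapper` even though the caller only ever named the base class.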
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
observation_space(self)¶ akro.Space: The observation space specification.
-
property
spec(self)¶ garage.envs.env_spec.EnvSpec: The environment specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
reset(self)¶ Call reset on wrapped env.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
-
step(self, action)¶ Call step on wrapped env.
- Parameters
action (np.ndarray) – An action provided by the agent.
- Returns
The environment step resulting from the action.
- Return type
- Raises
RuntimeError – if step() is called after the environment has been constructed and reset() has not been called.
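The reset-before-step contract described above can be sketched with a toy class. `CounterEnv` is hypothetical and does not subclass garage.Environment; its `step()` returns a plain `(observation, done)` tuple rather than the environment step object, purely to keep the example self-contained.

```python
class CounterEnv:
    """Hypothetical toy environment that counts steps."""

    def __init__(self, max_episode_length=3):
        self._max_episode_length = max_episode_length
        self._step_cnt = None  # None means reset() has not been called yet

    def reset(self):
        """Return the first observation and the episode-level info dict."""
        self._step_cnt = 0
        first_obs = 0
        episode_info = {}  # empty in this toy example
        return first_obs, episode_info

    def step(self, action):
        """Advance one step; raise if reset() was never called."""
        if self._step_cnt is None:
            raise RuntimeError('reset() must be called before step()!')
        self._step_cnt += 1
        done = self._step_cnt >= self._max_episode_length
        return self._step_cnt, done


env = CounterEnv()
try:
    env.step(0)  # stepping a freshly constructed env is an error
    raised = False
except RuntimeError:
    raised = True

first_obs, episode_info = env.reset()
observations = []
done = False
while not done:
    obs, done = env.step(0)
    observations.append(obs)
```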
-
render(self, mode)¶ Renders the environment.
-
visualize(self)¶ Creates a visualization of the environment.
-
close(self)¶ Close the wrapped env.
-
class
MetaWorldSetTaskEnv(benchmark=None, kind=None, wrapper=None, add_env_onehot=False)¶ Bases:
garage._environment.Environment
Environment form of a MetaWorld benchmark.
If a TaskSampler can be used instead, it is generally more efficient than this class, since each instance of this class internally caches a copy of each environment in the benchmark.
In order to sample tasks from this environment, a benchmark must be passed at construction time.
- Parameters
benchmark (metaworld.Benchmark or None) – The benchmark to wrap.
wrapper (Callable[garage.Env, garage.Env] or None) – Wrapper to apply to env instances.
add_env_onehot (bool) – If true, a one-hot representing the current environment name will be added to the environments. Should only be used with multi-task benchmarks.
- Raises
ValueError – If kind is not ‘train’, ‘test’, or None. Also raised if add_env_onehot is used on a metaworld meta learning (not multi-task) benchmark.
-
property
num_tasks(self)¶ int: Returns number of tasks.
Part of the set_task environment protocol.
-
sample_tasks(self, n_tasks)¶ Samples n_tasks tasks.
Part of the set_task environment protocol. To call this method, a benchmark must have been passed in at environment construction.
-
set_task(self, task)¶ Set the task.
Part of the set_task environment protocol.
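The three members that make up the set_task environment protocol (`num_tasks`, `sample_tasks`, `set_task`) can be sketched on a toy object. `ToySetTaskEnv` and its goal-valued tasks are hypothetical, and sampling with replacement is an assumption of this sketch rather than documented behaviour.

```python
import random


class ToySetTaskEnv:
    """Hypothetical object following the set_task protocol."""

    def __init__(self, goals, seed=0):
        self._goals = list(goals)
        self._rng = random.Random(seed)
        self.current_goal = None

    @property
    def num_tasks(self):
        """int: Number of distinct tasks."""
        return len(self._goals)

    def sample_tasks(self, n_tasks):
        """Sample n_tasks tasks (with replacement, by assumption)."""
        return [self._rng.choice(self._goals) for _ in range(n_tasks)]

    def set_task(self, task):
        """Make task the active task."""
        self.current_goal = task


goals = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
env = ToySetTaskEnv(goals)
tasks = env.sample_tasks(5)
env.set_task(tasks[0])
```

A sampler can then treat any object exposing these three members uniformly, without knowing which benchmark sits behind it.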
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
observation_space(self)¶ akro.Space: The observation space specification.
-
property
spec(self)¶ EnvSpec: The environment specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
step(self, action)¶ Step the wrapped env.
- Parameters
action (np.ndarray) – An action provided by the agent.
- Returns
The environment step resulting from the action.
- Return type
-
reset(self)¶ Reset the wrapped env.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
-
render(self, mode)¶ Render the wrapped environment.
-
visualize(self)¶ Creates a visualization of the wrapped environment.
-
close(self)¶ Close the wrapped env.
-
class
MultiEnvWrapper(envs, sample_strategy=uniform_random_strategy, mode='add-onehot', env_names=None)¶ Bases:
garage.Wrapper
A wrapper class to handle multiple environments.
This wrapper adds an integer ‘task_id’ to env_info every timestep.
- Parameters
envs (list(Environment)) – A list of objects implementing Environment.
sample_strategy (function(int, int)) – Sample strategy to be used when sampling a new task.
mode (str) – The type of observation to use. One of ‘vanilla’, ‘add-onehot’ and ‘del-onehot’.
‘vanilla’ provides the observation as it is. Use case: metaworld environments with MT* algorithms, gym environments with Task Embedding.
‘add-onehot’ appends a one-hot task id to the observation. Use case: gym environments with MT* algorithms.
‘del-onehot’ assumes a one-hot task id is appended to the observation, and excludes it. Use case: metaworld environments with Task Embedding.
env_names (list(str)) – The names of the environments corresponding to envs. The index of an env_name must correspond to the index of the corresponding env in envs. Each env_name in env_names must be unique.
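The effect of the three `mode` values on a single observation can be sketched as follows. This is an illustration only: `transform_observation` is a hypothetical helper, observations are plain Python lists rather than arrays, and the one-hot is assumed to occupy the last `n_tasks` entries.

```python
def one_hot(task_index, n_tasks):
    """Return a one-hot list of length n_tasks with a 1.0 at task_index."""
    return [1.0 if i == task_index else 0.0 for i in range(n_tasks)]


def transform_observation(obs, mode, task_index, n_tasks):
    """Apply one of the three documented observation modes."""
    obs = list(obs)
    if mode == 'vanilla':
        return obs  # observation passed through unchanged
    if mode == 'add-onehot':
        return obs + one_hot(task_index, n_tasks)  # append the task id
    if mode == 'del-onehot':
        return obs[:-n_tasks]  # drop an already-present task id
    raise ValueError('Unknown mode: {}'.format(mode))


raw = [0.5, -0.5]
with_id = transform_observation(raw, 'add-onehot', task_index=1, n_tasks=3)
without_id = transform_observation(with_id, 'del-onehot', task_index=1, n_tasks=3)
```

In this sketch ‘add-onehot’ followed by ‘del-onehot’ recovers the original observation, which is why the two modes suit environments that do not and do already carry a task id, respectively.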
-
property
observation_space(self)¶ Observation space.
- Returns
Observation space.
- Return type
akro.Box
-
property
spec(self)¶ Describes the action and observation spaces of the wrapped envs.
- Returns
The action and observation spaces of the wrapped environments.
- Return type
-
property
task_space(self)¶ Task Space.
- Returns
Task space.
- Return type
akro.Box
-
property
active_task_index(self)¶ Index of active task env.
- Returns
Index of active task.
- Return type
-
reset(self)¶ Sample new task and call reset on new task env.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
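Since reset() samples a new task through `sample_strategy`, it may help to see what such a `function(int, int)` could look like. The sketch below assumes the two integers are the number of tasks and the index of the previously active task (or None on the first call); that reading of the signature, and both function names, are assumptions of this example.

```python
import random


def uniform_random(num_tasks, _last_task=None):
    """Pick any task index with equal probability; ignores the last task."""
    return random.randint(0, num_tasks - 1)


def round_robin(num_tasks, last_task=None):
    """Cycle through task indices in order, starting from 0."""
    if last_task is None:
        return 0
    return (last_task + 1) % num_tasks


picks = []
last = None
for _ in range(5):
    last = round_robin(3, last)
    picks.append(last)
```

With three tasks, the round-robin strategy visits indices 0, 1, 2 and then wraps around, whereas the uniform strategy may repeat a task immediately.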
-
step(self, action)¶ Step the active task env.
-
close(self)¶ Close all task envs.
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
render(self, mode)¶ Render the wrapped environment.
-
visualize(self)¶ Creates a visualization of the wrapped environment.
-
property
unwrapped(self)¶ garage.Environment: The inner environment.
-
normalize¶
-
class
PointEnv(goal=np.array((1.0, 1.0), dtype=np.float32), arena_size=5.0, done_bonus=0.0, never_done=False, max_episode_length=math.inf)¶ Bases:
garage.Environment
A simple 2D point environment.
- Parameters
goal (np.ndarray) – A 2D array representing the goal position
arena_size (float) – The size of the arena. The point is constrained within (-arena_size, arena_size) in each dimension
done_bonus (float) – A numerical bonus added to the reward once the point has reached the goal
never_done (bool) – Never send a done signal, even if the agent achieves the goal
max_episode_length (int) – The maximum steps allowed for an episode.
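How `arena_size`, `done_bonus`, and `never_done` interact can be sketched with a small pure-Python step function. Everything beyond those three parameters is an assumption of this sketch: the action is taken to be a displacement added to the point, the reward is taken to be the negative distance to the goal, the bounds are enforced by clipping, and `success_radius` is a hypothetical threshold.

```python
import math


def point_step(point, action, goal=(1.0, 1.0), arena_size=5.0,
               done_bonus=0.0, never_done=False, success_radius=0.1):
    """Return (new_point, reward, done) for one step of a 2D point."""
    # Move, then clip each coordinate to the arena bounds.
    new_point = tuple(
        max(-arena_size, min(arena_size, p + a))
        for p, a in zip(point, action))
    dist = math.dist(new_point, goal)
    success = dist < success_radius  # hypothetical success test
    reward = -dist
    if success:
        reward += done_bonus  # bonus is paid on reaching the goal
    done = success and not never_done  # never_done suppresses the signal
    return new_point, reward, done


clipped, _, _ = point_step((4.9, 0.0), (0.5, 0.0))
_, bonus_reward, bonus_done = point_step((0.5, 1.0), (0.5, 0.0), done_bonus=10.0)
_, _, suppressed_done = point_step((0.5, 1.0), (0.5, 0.0), never_done=True)
```

Note that `never_done` only suppresses the done signal in this sketch; the bonus is still added to the reward when the goal is reached.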
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
observation_space(self)¶ akro.Space: The observation space specification.
-
property
spec(self)¶ EnvSpec: The environment specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
reset(self)¶ Reset the environment.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
-
step(self, action)¶ Step the environment.
- Parameters
action (np.ndarray) – An action provided by the agent.
- Returns
The environment step resulting from the action.
- Return type
- Raises
RuntimeError – if step() is called after the environment has been constructed and reset() has not been called.
-
render(self, mode)¶ Renders the environment.
-
visualize(self)¶ Creates a visualization of the environment.
-
close(self)¶ Close the env.
-
sample_tasks(self, num_tasks)¶ Sample a list of num_tasks tasks.
-
class
TaskNameWrapper(env, *, task_name=None, task_id=None)¶ Bases:
garage.Wrapper
Add task_name or task_id to env infos.
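What "adding task_name or task_id to env infos" amounts to can be sketched with a small helper operating on the info dictionary. `add_task_info` is hypothetical, and the dictionary keys simply mirror the constructor argument names; the wrapper's actual behaviour may differ in detail.

```python
def add_task_info(env_info, task_name=None, task_id=None):
    """Return a copy of env_info with whichever task fields were given."""
    info = dict(env_info)  # copy so the caller's dict is left untouched
    if task_name is not None:
        info['task_name'] = task_name
    if task_id is not None:
        info['task_id'] = task_id
    return info


original = {'success': False}
named = add_task_info(original, task_name='reach')
numbered = add_task_info(original, task_id=3)
```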
- Parameters
-
step(self, action)¶ gym.Env step for the active task env.
- Parameters
action (np.ndarray) – Action performed by the agent in the environment.
- Returns
- np.ndarray: Agent’s observation of the current environment.
- float: Amount of reward yielded by previous action.
- bool: True iff the episode has ended.
- dict[str, np.ndarray]: Contains auxiliary diagnostic information about this time-step.
- Return type
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
observation_space(self)¶ akro.Space: The observation space specification.
-
property
spec(self)¶ EnvSpec: The environment specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
reset(self)¶ Reset the wrapped env.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
-
render(self, mode)¶ Render the wrapped environment.
-
visualize(self)¶ Creates a visualization of the wrapped environment.
-
close(self)¶ Close the wrapped env.
-
property
unwrapped(self)¶ garage.Environment: The inner environment.
-
class
TaskOnehotWrapper(env, task_index, n_total_tasks)¶ Bases:
garage.Wrapper
Append a one-hot task representation to an environment.
See TaskOnehotWrapper.wrap_env_list for the recommended way of creating this class.
- Parameters
env (Environment) – The environment to wrap.
task_index (int) – The index of this task among the tasks.
n_total_tasks (int) – The number of total tasks.
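The role of `task_index` and `n_total_tasks` can be sketched in plain Python, together with the reason a stable ordering of environments matters. The helpers `append_one_hot` and `encode_in_order` and the task names are hypothetical; observations are plain lists here rather than arrays.

```python
def append_one_hot(obs, task_index, n_total_tasks):
    """Append a one-hot encoding of task_index to obs."""
    one_hot = [0.0] * n_total_tasks
    one_hot[task_index] = 1.0
    return list(obs) + one_hot


def encode_in_order(env_names):
    """Assign each name the one-hot given by its position in the list."""
    n_total = len(env_names)
    return {name: append_one_hot([], index, n_total)
            for index, name in enumerate(env_names)}


obs = append_one_hot([0.25, 0.75], task_index=2, n_total_tasks=3)

first = encode_in_order(['reach', 'push', 'pick-place'])
second = encode_in_order(['push', 'reach', 'pick-place'])
```

Because the encoding is derived from list position, `first['reach']` and `second['reach']` differ, so a policy trained with one ordering would misread observations produced with the other.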
-
property
observation_space(self)¶ akro.Space: The observation space specification.
-
property
spec(self)¶ Return the environment specification.
- Returns
The environment specification.
- Return type
-
reset(self)¶ Sample new task and call reset on new task env.
- Returns
- numpy.ndarray: The first observation conforming to observation_space.
- dict: The episode-level information. Note that this is not part of env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
- Return type
numpy.ndarray
-
step(self, action)¶ Environment step for the active task env.
- Parameters
action (np.ndarray) – Action performed by the agent in the environment.
- Returns
The environment step resulting from the action.
- Return type
-
classmethod
wrap_env_list(cls, envs)¶ Wrap a list of environments, giving each environment a one-hot.
This is the primary way of constructing instances of this class. It’s mostly useful when training multi-task algorithms using a multi-task aware sampler.
For example:

```python
envs = get_mt10_envs()
wrapped = TaskOnehotWrapper.wrap_env_list(envs)
sampler = trainer.make_sampler(LocalSampler, env=wrapped)
```
- Parameters
envs (list[Environment]) – List of environments to wrap. Note that the order these environments are passed in determines the value of their one-hot encoding. It is essential that this list is always in the same order, or the resulting encodings will be inconsistent.
- Returns
The wrapped environments.
- Return type
-
classmethod
wrap_env_cons_list(cls, env_cons)¶ Wrap a list of environment constructors, giving each a one-hot.
This function is useful if you want to avoid constructing any environments in the main experiment process, and are using a multi-task aware remote sampler (e.g. RaySampler).
For example:

```python
env_constructors = get_mt10_env_cons()
wrapped = TaskOnehotWrapper.wrap_env_cons_list(env_constructors)
env_updates = [NewEnvUpdate(wrapped_con) for wrapped_con in wrapped]
sampler = trainer.make_sampler(RaySampler, env=env_updates)
```
- Parameters
env_cons (list[Callable[Environment]]) – List of environment constructors to wrap. Note that the order these constructors are passed in determines the value of their one-hot encoding. It is essential that this list is always in the same order, or the resulting encodings will be inconsistent.
- Returns
The wrapped environments.
- Return type
list[Callable[TaskOnehotWrapper]]
-
property
action_space(self)¶ akro.Space: The action space specification.
-
property
render_modes(self)¶ list: A list of strings representing the supported render modes.
-
render(self, mode)¶ Render the wrapped environment.
-
visualize(self)¶ Creates a visualization of the wrapped environment.
-
close(self)¶ Close the wrapped env.
-
property
unwrapped(self)¶ garage.Environment: The inner environment.