garage.envs

Garage wrappers for gym environments.
class GridWorldEnv(desc='4x4', max_episode_length=None)
Bases: garage.Environment

A simple 2D grid environment.

Map legend:
- 'S': starting point
- 'F' or '.': free space
- 'W' or 'x': wall
- 'H' or 'o': hole (terminates the episode)
- 'G': goal
action_space
    The action space specification.
    Type: akro.Space

observation_space
    The observation space specification.
    Type: akro.Space
reset(self)
    Resets the environment.

    Returns:
        numpy.ndarray: The first observation conforming to observation_space.
        dict: The episode-level information. Note that this is not part of the env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
step(self, action)
    Steps the environment.

    Action map: 0: left, 1: down, 2: right, 3: up

    Parameters:
        action (int) – an int encoding the action

    Returns:
        EnvStep: The environment step resulting from the action.

    Raises:
        RuntimeError – if step() is called after the environment has been constructed and reset() has not been called.
        NotImplementedError – if the next state in self._desc does not match a known state type.
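The action map and cell handling above can be sketched in plain Python. This is an illustrative analogue, not garage's actual implementation; the reward values and the stay-in-place behavior at walls and grid edges are assumptions:

```python
# Illustrative sketch only (not garage's code): one step of a grid
# world over a character map, using the action map 0: left, 1: down,
# 2: right, 3: up. Rewards (1.0 at goal, 0.0 elsewhere) are assumed.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def grid_step(desc, pos, action):
    """Return (new_pos, reward, done) for one step on a char grid."""
    if action not in MOVES:
        raise NotImplementedError(f'unknown action {action}')
    dr, dc = MOVES[action]
    r, c = pos[0] + dr, pos[1] + dc
    # Stay in place when moving off the grid or into a wall.
    if not (0 <= r < len(desc) and 0 <= c < len(desc[0])) or desc[r][c] in 'Wx':
        return pos, 0.0, False
    cell = desc[r][c]
    if cell in 'Ho':          # hole: episode terminates without reward
        return (r, c), 0.0, True
    if cell == 'G':           # goal: episode terminates with reward
        return (r, c), 1.0, True
    if cell not in 'SF.':
        raise NotImplementedError(f'unknown state type {cell!r}')
    return (r, c), 0.0, False

desc = ['SFFF',
        'FWFH',
        'FFFG']
pos, reward, done = grid_step(desc, (0, 0), 1)   # move down onto free space
```

The NotImplementedError branch mirrors the documented behavior when a cell in self._desc does not match a known state type.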
render(self, mode)
    Renders the environment.

    Parameters:
        mode (str) – the mode to render with. The string must be present in Environment.render_modes.

visualize(self)
    Creates a visualization of the environment.

close(self)
    Close the env.
class GymEnv(env, is_image=False, max_episode_length=None)
Bases: garage.Environment

An abstract Garage wrapper class for gym.Env.

In order to provide pickling (serialization) and parameterization for gym.Env instances, they must be wrapped with GymEnv. This ensures compatibility with existing samplers and checkpointing when the envs are passed around internally in garage.

Furthermore, classes inheriting from GymEnv should silently convert action_space and observation_space from gym.Space to akro.Space.

GymEnv handles all environments created by make(). It returns a different wrapper class instance if the input environment requires special handling. Currently supported wrapper classes are:
- garage.envs.bullet.BulletEnv for Bullet-based gym environments.

See __new__() for details.
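The wrapper pattern GymEnv embodies can be sketched with stand-in classes. This is an illustrative analogue, not the real garage or akro code; all class names below are hypothetical stand-ins (the real conversion would go through akro's space types):

```python
# Illustrative sketch of the wrapper pattern: delegate to an inner env
# while converting its space objects to a different space type, the way
# GymEnv converts gym.Space to akro.Space. All classes here are fakes.
class InnerSpace:            # stand-in for a gym.Space
    def __init__(self, n):
        self.n = n

class ConvertedSpace:        # stand-in for an akro.Space
    def __init__(self, n):
        self.n = n

def convert(space):
    # Stand-in for the gym.Space -> akro.Space conversion.
    return ConvertedSpace(space.n)

class InnerEnv:              # stand-in for a gym.Env
    action_space = InnerSpace(4)

    def step(self, action):
        return action * 2

class WrapperEnv:            # stand-in for GymEnv
    def __init__(self, env):
        self._env = env
        # Expose converted spaces to the rest of the framework.
        self.action_space = convert(env.action_space)

    def step(self, action):
        return self._env.step(action)   # delegate to the wrapped env

env = WrapperEnv(InnerEnv())
```

Samplers and checkpointing code only ever see the wrapper's converted spaces, which is the compatibility property described above.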
action_space
    The action space specification.
    Type: akro.Space

observation_space
    The observation space specification.
    Type: akro.Space

spec
    The environment specification.
    Type: garage.envs.env_spec.EnvSpec
reset(self)
    Calls reset on the wrapped env.

    Returns:
        numpy.ndarray: The first observation conforming to observation_space.
        dict: The episode-level information. Note that this is not part of the env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
step(self, action)
    Calls step on the wrapped env.

    Parameters:
        action (np.ndarray) – An action provided by the agent.

    Returns:
        EnvStep: The environment step resulting from the action.

    Raises:
        RuntimeError – if step() is called after the environment has been constructed and reset() has not been called.

render(self, mode)
    Renders the environment.

    Parameters:
        mode (str) – the mode to render with. The string must be present in self.render_modes.

    Returns:
        object: the return value for render, depending on each env.

visualize(self)
    Creates a visualization of the environment.

close(self)
    Close the wrapped env.
class MultiEnvWrapper(envs, sample_strategy=uniform_random_strategy, mode='add-onehot', env_names=None)
Bases: garage.Wrapper

A wrapper class to handle multiple environments.

This wrapper adds an integer 'task_id' to env_info every timestep.

Parameters:
- envs (list(Environment)) – A list of objects implementing Environment.
- sample_strategy (function(int, int)) – Sample strategy to be used when sampling a new task.
- mode (str) – A string from 'vanilla', 'add-onehot' and 'del-onehot'; the type of observation to use.
  - 'vanilla' provides the observation as-is. Use case: metaworld environments with MT* algorithms, gym environments with Task Embedding.
  - 'add-onehot' appends a one-hot task id to the observation. Use case: gym environments with MT* algorithms.
  - 'del-onehot' assumes a one-hot task id is appended to the observation, and removes it. Use case: metaworld environments with Task Embedding.
- env_names (list(str)) – The names of the environments corresponding to envs. The index of an env_name must correspond to the index of the corresponding env in envs. Each env_name in env_names must be unique.
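The three observation modes can be sketched in plain Python. This is an illustrative analogue operating on lists, not garage's actual implementation, which works on numpy arrays:

```python
# Illustrative sketch of MultiEnvWrapper's observation modes
# (not the actual garage code).
def one_hot(task_id, n_tasks):
    """A length-n_tasks one-hot vector with a 1.0 at task_id."""
    return [1.0 if i == task_id else 0.0 for i in range(n_tasks)]

def transform_obs(obs, task_id, n_tasks, mode):
    if mode == 'vanilla':      # pass the observation through unchanged
        return list(obs)
    if mode == 'add-onehot':   # append a one-hot task id
        return list(obs) + one_hot(task_id, n_tasks)
    if mode == 'del-onehot':   # strip a trailing one-hot task id
        return list(obs)[:-n_tasks]
    raise ValueError(f'unknown mode {mode!r}')

obs = [0.5, -0.2]
transform_obs(obs, task_id=1, n_tasks=3, mode='add-onehot')
# → [0.5, -0.2, 0.0, 1.0, 0.0]
```

Note that 'add-onehot' and 'del-onehot' are inverses of each other, which is why the del mode suits metaworld environments that already embed a one-hot task id.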
observation_space
    Observation space.
    Type: akro.Box

spec
    Describes the action and observation spaces of the wrapped envs.
    Type: EnvSpec

task_space
    Task space.
    Type: akro.Box

action_space
    The action space specification.
    Type: akro.Space
reset(self)
    Samples a new task and calls reset on the new task's env.

    Returns:
        numpy.ndarray: The first observation conforming to observation_space.
        dict: The episode-level information. Note that this is not part of the env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).

step(self, action)
    Steps the active task env.

    Parameters:
        action (object) – object to be passed into Environment.step(action)

    Returns:
        EnvStep: The environment step resulting from the action.

close(self)
    Close all task envs.

render(self, mode)
    Renders the wrapped environment.

    Parameters:
        mode (str) – the mode to render with. The string must be present in self.render_modes.

    Returns:
        object: the return value for render, depending on each env.

visualize(self)
    Creates a visualization of the wrapped environment.
normalize
class PointEnv(goal=np.array((1.0, 1.0), dtype=np.float32), arena_size=5.0, done_bonus=0.0, never_done=False, max_episode_length=math.inf)
Bases: garage.Environment

A simple 2D point environment.

Parameters:
- goal (np.ndarray) – A 2D array representing the goal position.
- arena_size (float) – The size of the arena; the point is constrained within (-arena_size, arena_size) in each dimension.
- done_bonus (float) – A numerical bonus added to the reward once the point has reached the goal.
- never_done (bool) – Never send a done signal, even if the agent achieves the goal.
- max_episode_length (int) – The maximum number of steps allowed for an episode.
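The dynamics such an environment typically implements can be sketched in plain Python. This is an illustrative analogue, not garage's exact code; the negative-distance reward shaping and the 1e-1 goal threshold are assumptions:

```python
import math

# Illustrative sketch of a 2D point environment's step. Assumed
# dynamics: position moves by the action, clipped into the arena;
# reward is negative distance to the goal, plus done_bonus on reaching
# it; the 1e-1 goal threshold is an assumption.
def point_step(pos, action, goal, arena_size=5.0, done_bonus=0.0,
               never_done=False):
    # Clip the resulting position into (-arena_size, arena_size).
    x = min(max(pos[0] + action[0], -arena_size), arena_size)
    y = min(max(pos[1] + action[1], -arena_size), arena_size)
    dist = math.hypot(x - goal[0], y - goal[1])
    reached = dist < 1e-1
    reward = -dist + (done_bonus if reached else 0.0)
    # never_done suppresses the done signal even at the goal.
    done = reached and not never_done
    return (x, y), reward, done

pos, reward, done = point_step((0.0, 0.0), (0.5, 0.5), goal=(1.0, 1.0))
```

This makes the roles of done_bonus and never_done concrete: the former shapes the terminal reward, the latter keeps episodes running past the goal.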
action_space
    The action space specification.
    Type: akro.Space

observation_space
    The observation space specification.
    Type: akro.Space

reset(self)
    Resets the environment.

    Returns:
        numpy.ndarray: The first observation conforming to observation_space.
        dict: The episode-level information. Note that this is not part of the env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
step(self, action)
    Steps the environment.

    Parameters:
        action (np.ndarray) – An action provided by the agent.

    Returns:
        EnvStep: The environment step resulting from the action.

    Raises:
        RuntimeError – if step() is called after the environment has been constructed and reset() has not been called.

render(self, mode)
    Renders the environment.

    Parameters:
        mode (str) – the mode to render with. The string must be present in self.render_modes.

    Returns:
        str: the point and goal of the environment.

visualize(self)
    Creates a visualization of the environment.

close(self)
    Close the env.
class TaskOnehotWrapper(env, task_index, n_total_tasks)
Bases: garage.Wrapper

Append a one-hot task representation to an environment.

See TaskOnehotWrapper.wrap_env_list for the recommended way of creating this class.

Parameters:
- env (Environment) – The environment to wrap.
- task_index (int) – The index of this task among the tasks.
- n_total_tasks (int) – The number of total tasks.
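The positional index assignment performed by wrap_env_list can be sketched as follows. This is an illustrative analogue: the real classmethod returns TaskOnehotWrapper instances, not tuples, and the env names below are made up:

```python
# Illustrative sketch (not garage's code): give each env in a list its
# positional task index, as wrap_env_list does. We record the
# (env, task_index, n_total_tasks) triple instead of building wrappers.
def wrap_env_list_sketch(envs):
    n = len(envs)
    return [(env, i, n) for i, env in enumerate(envs)]

wrapped = wrap_env_list_sketch(['push', 'reach', 'pick-place'])
# wrapped[1] → ('reach', 1, 3): the second env gets task index 1 of 3
```

Because the index comes from list position, the encodings are only consistent across runs if the list order never changes, which is exactly the caveat in the wrap_env_list documentation below.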
observation_space
    The observation space specification.
    Type: akro.Space

spec
    The environment specification.
    Type: EnvSpec

action_space
    The action space specification.
    Type: akro.Space

reset(self)
    Samples a new task and calls reset on the new task's env.

    Returns:
        numpy.ndarray: The first observation conforming to observation_space.
        dict: The episode-level information. Note that this is not part of the env_info provided in step(). It contains information about the entire episode, which could be needed to determine the first action (e.g. in the case of goal-conditioned or MTRL).
step(self, action)
    Environment step for the active task env.

    Parameters:
        action (np.ndarray) – Action performed by the agent in the environment.

    Returns:
        EnvStep: The environment step resulting from the action.

classmethod wrap_env_list(cls, envs)
    Wrap a list of environments, giving each environment a one-hot.

    This is the primary way of constructing instances of this class. It's mostly useful when training multi-task algorithms using a multi-task aware sampler.

    For example:

        envs = get_mt10_envs()
        wrapped = TaskOnehotWrapper.wrap_env_list(envs)
        sampler = runner.make_sampler(LocalSampler, env=wrapped)

    Parameters:
        envs (list[Environment]) – List of environments to wrap. Note that the order these environments are passed in determines the value of their one-hot encoding. It is essential that this list is always in the same order, or the resulting encodings will be inconsistent.

    Returns:
        The wrapped environments.

    Return type:
        list[TaskOnehotWrapper]
classmethod wrap_env_cons_list(cls, env_cons)
    Wrap a list of environment constructors, giving each a one-hot.

    This function is useful if you want to avoid constructing any environments in the main experiment process, and are using a multi-task aware remote sampler (i.e. ~RaySampler).

    For example:

        env_constructors = get_mt10_env_cons()
        wrapped = TaskOnehotWrapper.wrap_env_cons_list(env_constructors)
        env_updates = [NewEnvUpdate(wrapped_con)
                       for wrapped_con in wrapped]
        sampler = runner.make_sampler(RaySampler, env=env_updates)

    Parameters:
        env_cons (list[Callable[Environment]]) – List of environment constructors to wrap. Note that the order these constructors are passed in determines the value of their one-hot encoding. It is essential that this list is always in the same order, or the resulting encodings will be inconsistent.

    Returns:
        The wrapped environments.

    Return type:
        list[Callable[TaskOnehotWrapper]]
render(self, mode)
    Renders the wrapped environment.

    Parameters:
        mode (str) – the mode to render with. The string must be present in self.render_modes.

    Returns:
        object: the return value for render, depending on each env.

visualize(self)
    Creates a visualization of the wrapped environment.

close(self)
    Close the wrapped env.