garage.envs package

Garage wrappers for gym environments.

class GarageEnv(env=None, env_name='')[source]

Bases: gym.core.Wrapper

An abstract Garage wrapper class for gym.Env.

In order to provide pickling (serialization) and parameterization for gym.Envs, they must be wrapped with a GarageEnv. This ensures compatibility with existing samplers and checkpointing when the envs are passed internally around garage.

Furthermore, classes inheriting from GarageEnv should silently convert action_space and observation_space from gym.Spaces to akro.spaces.

Parameters:
  • env (gym.Env) – An env that will be wrapped
  • env_name (str) – If env_name is specified, a gym environment with that name will be created. If no such environment exists, a gym.error is raised.
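The wrapping pattern described above can be sketched without the real dependencies. The stand-in `StubSpace`/`StubEnv` classes below are hypothetical placeholders for gym objects, and the `Wrapper` class only illustrates the delegate-and-convert structure the text describes; the real GarageEnv additionally converts gym.Spaces to akro spaces for pickling.

```python
class StubSpace:
    """Minimal stand-in for a gym.Space (illustrative only)."""
    def __init__(self, n):
        self.n = n

class StubEnv:
    """Minimal stand-in for a gym.Env: a 3-step episode with reward 1 per step."""
    def __init__(self):
        self.action_space = StubSpace(2)
        self.observation_space = StubSpace(4)
        self._t = 0
    def reset(self):
        self._t = 0
        return 0.0
    def step(self, action):
        self._t += 1
        return 0.0, 1.0, self._t >= 3, {}

class Wrapper:
    """Sketch of the GarageEnv pattern: hold the inner env, expose its
    spaces (the real class converts them to akro), and delegate calls."""
    def __init__(self, env):
        self.env = env
        self.action_space = env.action_space
        self.observation_space = env.observation_space
    def reset(self, **kwargs):
        return self.env.reset(**kwargs)
    def step(self, action):
        return self.env.step(action)
    def close(self):
        pass

env = Wrapper(StubEnv())
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = env.step(0)
    total_reward += reward
print(total_reward)  # 3.0
```

Samplers interact only with the wrapper's reset/step/close surface, which is why wrapping is sufficient for checkpointing compatibility.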
close()[source]

Close the wrapped env.

reset(**kwargs)[source]

Call reset on wrapped env.

This method is necessary to suppress a deprecated warning thrown by gym.Wrapper.

Parameters:kwargs – Keyword args
Returns:The initial observation.
Return type:object
step(action)[source]

Call step on wrapped env.

This method is necessary to suppress a deprecated warning thrown by gym.Wrapper.

Parameters:action (object) – An action provided by the agent.
Returns:
  • object – Agent’s observation of the current environment
  • float – Amount of reward returned after previous action
  • bool – Whether the episode has ended, in which case further step() calls will return undefined results
  • dict – Contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:tuple
Step(observation, reward, done, **kwargs)[source]

Create a namedtuple from the results of environment.step(action).

Provides the option to put extra diagnostic info in the kwargs (if it exists) without demanding an explicit positional argument.

Parameters:
  • observation (object) – Agent’s observation of the current environment
  • reward (float) – Amount of reward returned after previous action
  • done (bool) – Whether the episode has ended, in which case further step() calls will return undefined results
  • kwargs – Keyword args
Returns:

A named tuple of the arguments.

Return type:

collections.namedtuple
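The Step helper's behavior, as the parameter list describes it, can be sketched as a thin factory over a namedtuple, with extra keyword arguments collected into an `info` dict. The exact field names are an assumption based on the step() documentation above.

```python
import collections

# Sketch of the Step helper: pack step() results into a namedtuple,
# with any extra keyword arguments collected into the `info` field.
_Step = collections.namedtuple('Step', ['observation', 'reward', 'done', 'info'])

def Step(observation, reward, done, **kwargs):
    """Create a Step namedtuple; kwargs become the diagnostic info dict."""
    return _Step(observation, reward, done, kwargs)

s = Step([0.1, 0.2], reward=1.0, done=False, distance=0.5)
print(s.reward)  # 1.0
print(s.info)    # {'distance': 0.5}
```

Because the result is a namedtuple, it still unpacks positionally like a plain `(observation, reward, done, info)` tuple.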

class EnvSpec(observation_space, action_space)[source]

Bases: object

EnvSpec class.

Parameters:
  • observation_space (akro.Space) – The observation space of the env.
  • action_space (akro.Space) – The action space of the env.
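A minimal sketch of what an EnvSpec construction looks like, assuming only the two attributes documented above. `Box` here is a hypothetical stand-in for akro.Box, which the real class expects.

```python
class Box:
    """Stand-in for akro.Box: a bounded continuous space (illustrative only)."""
    def __init__(self, low, high, shape):
        self.low, self.high, self.shape = low, high, shape

class EnvSpec:
    """Plain container pairing an observation space with an action space."""
    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space
        self.action_space = action_space

spec = EnvSpec(observation_space=Box(-1.0, 1.0, shape=(2,)),
               action_space=Box(-0.1, 0.1, shape=(2,)))
print(spec.action_space.shape)  # (2,)
```

Policies and samplers typically read only these two attributes from the spec, so a plain container suffices.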
class GridWorldEnv(desc='4x4')[source]

Bases: gym.core.Env

‘S’ : starting point
‘F’ or ‘.’: free space
‘W’ or ‘x’: wall
‘H’ or ‘o’: hole (terminates episode)
‘G’ : goal
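The map legend above can be exercised by parsing a description grid. The specific 4x4 layout below is an assumption for illustration (the default GridWorldEnv map may differ); only the legend characters are taken from the text.

```python
# Hypothetical 4x4 grid using the legend above:
# S = start, F = free, H = hole, G = goal
desc = [
    'SFFF',
    'FHFH',
    'FFFH',
    'HFFG',
]

start = goal = None
holes = []
for row, line in enumerate(desc):
    for col, cell in enumerate(line):
        if cell == 'S':
            start = (row, col)
        elif cell == 'G':
            goal = (row, col)
        elif cell in ('H', 'o'):
            holes.append((row, col))

print(start)       # (0, 0)
print(goal)        # (3, 3)
print(len(holes))  # 4
```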
static action_from_direction(d)[source]

Return the action corresponding to the given direction. This is a helper method for debugging and testing purposes.

Returns:the action index corresponding to the given direction

action_space
get_possible_next_states(state, action)[source]

Given the state and action, return a list of possible next states and their probabilities. Only next states with nonzero probabilities will be returned.

Parameters:
  • state – start state
  • action – action
Returns:a list of pairs (s’, p(s’|s,a))
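For a deterministic grid, get_possible_next_states reduces to a single `(next_state, 1.0)` pair; the sketch below illustrates that shape, clipping moves at the grid boundary. The flattened state indexing (`row * n_cols + col`) and deterministic transitions are assumptions for illustration.

```python
# Deterministic sketch of get_possible_next_states on a 4x4 grid.
# States are flattened indices: state = row * N_COLS + col.
N_ROWS, N_COLS = 4, 4
# Action map per the step() documentation: 0 left, 1 down, 2 right, 3 up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def get_possible_next_states(state, action):
    """Return [(s', p(s'|s,a))]; deterministic, so a single pair with p=1."""
    row, col = divmod(state, N_COLS)
    dr, dc = MOVES[action]
    nr = min(max(row + dr, 0), N_ROWS - 1)  # clip at grid boundary
    nc = min(max(col + dc, 0), N_COLS - 1)
    return [(nr * N_COLS + nc, 1.0)]

print(get_possible_next_states(0, 2))  # [(1, 1.0)]  right from (0, 0)
print(get_possible_next_states(0, 3))  # [(0, 1.0)]  up is clipped at the border
```

A stochastic grid would instead return several pairs whose probabilities sum to 1.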

log_diagnostics(paths)[source]
observation_space
render(mode='human')[source]

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.
  • rgb_array: Return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
Parameters:mode (str) – the mode to render with

Example:

class MyEnv(Env):
    metadata = {'render.modes': ['human', 'rgb_array']}

    def render(self, mode='human'):
        if mode == 'rgb_array':
            return np.array(...)  # return RGB frame suitable for video
        elif mode == 'human':
            ...  # pop up a window and render
        else:
            super(MyEnv, self).render(mode=mode)  # just raise an exception
reset()[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Action map: 0: left, 1: down, 2: right, 3: up.

Parameters:action – should be a one-hot vector encoding the action
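Decoding the one-hot action vector this step() expects can be sketched as follows; the `decode_action` helper and `ACTION_NAMES` list are illustrative, not part of the class.

```python
# Map action indices to directions, per the action map above.
ACTION_NAMES = ['left', 'down', 'right', 'up']

def decode_action(one_hot):
    """Turn a one-hot action vector into the integer action index."""
    assert sum(one_hot) == 1 and all(v in (0, 1) for v in one_hot)
    return one_hot.index(1)

a = decode_action([0, 0, 1, 0])
print(a, ACTION_NAMES[a])  # 2 right
```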

normalize

alias of garage.envs.normalized_env.NormalizedEnv
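A wrapper like NormalizedEnv typically rescales observations using running statistics. The sketch below shows that general idea only; the exponential-average update rule and constants are assumptions for illustration, not garage's actual implementation.

```python
# Sketch of running observation normalization (scalar case for brevity).
class RunningNormalizer:
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # smoothing factor for the running statistics
        self.mean = 0.0
        self.var = 1.0
    def update(self, obs):
        # Exponentially weighted estimates of mean and variance.
        self.mean = (1 - self.alpha) * self.mean + self.alpha * obs
        self.var = (1 - self.alpha) * self.var + self.alpha * (obs - self.mean) ** 2
    def normalize(self, obs):
        return (obs - self.mean) / (self.var ** 0.5 + 1e-8)

norm = RunningNormalizer()
for obs in [10.0, 10.0, 10.0]:
    norm.update(obs)
# As the running mean approaches the observations, the normalized
# magnitude shrinks well below the raw value.
print(abs(norm.normalize(10.0)) < 10.0)  # True
```

Normalization like this keeps observation scales comparable across environments, which stabilizes learning for scale-sensitive policies.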

class PointEnv(goal=array([1., 1.], dtype=float32), done_bonus=0.0, never_done=False)[source]

Bases: gym.core.Env

A simple 2D point environment.

observation_space

The observation space

Type:gym.spaces.Box
action_space

The action space

Type:gym.spaces.Box
Parameters:
  • goal (np.ndarray, optional) – A 2D array representing the goal position
  • done_bonus (float, optional) – A numerical bonus added to the reward once the point has reached the goal
  • never_done (bool, optional) – Never send a done signal, even if the agent achieves the goal.
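The three parameters above can be illustrated with a toy point environment. Everything in this sketch beyond the parameter semantics (the negative-distance reward, the done threshold, the starting position) is an assumption for illustration, not PointEnv's actual dynamics.

```python
import math

class PointSketch:
    """Toy 2D point env: action is a displacement, reward is -distance
    to the goal plus done_bonus when the goal is reached (illustrative)."""
    def __init__(self, goal=(1.0, 1.0), done_bonus=0.0, never_done=False):
        self.goal = goal
        self.done_bonus = done_bonus
        self.never_done = never_done
        self.pos = [0.0, 0.0]
    def reset(self):
        self.pos = [0.0, 0.0]
        return tuple(self.pos)
    def step(self, action):
        self.pos[0] += action[0]
        self.pos[1] += action[1]
        dist = math.hypot(self.pos[0] - self.goal[0], self.pos[1] - self.goal[1])
        done = (dist < 1e-2) and not self.never_done
        reward = -dist + (self.done_bonus if done else 0.0)
        return tuple(self.pos), reward, done, {'distance': dist}

env = PointSketch(done_bonus=1.0)
env.reset()
obs, reward, done, info = env.step((1.0, 1.0))  # jump straight to the goal
print(done)    # True
print(reward)  # 1.0 (zero distance plus the done bonus)
```

With `never_done=True` the same step would return `done=False`, which is useful for infinite-horizon training setups.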
action_space
observation_space
render(mode='human')[source]

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.
  • rgb_array: Return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
Parameters:mode (str) – the mode to render with

Example:

class MyEnv(Env):
    metadata = {'render.modes': ['human', 'rgb_array']}

    def render(self, mode='human'):
        if mode == 'rgb_array':
            return np.array(...)  # return RGB frame suitable for video
        elif mode == 'human':
            ...  # pop up a window and render
        else:
            super(MyEnv, self).render(mode=mode)  # just raise an exception
reset()[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:
  • observation (object) – agent’s observation of the current environment
  • reward (float) – amount of reward returned after previous action
  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:tuple