garage.envs package

Garage wrappers for gym environments.

class GarageEnv(env=None, env_name='')[source]

Bases: gym.core.Wrapper

An abstract Garage wrapper class for gym.Env.

In order to provide pickling (serialization) and parameterization for gym.Envs, they must be wrapped with a GarageEnv. This ensures compatibility with existing samplers and checkpointing when the envs are passed internally around garage.

Furthermore, classes inheriting from GarageEnv should silently convert action_space and observation_space from gym.Spaces to akro.spaces.

Parameters:
  • env (gym.Env) – An env that will be wrapped
  • env_name (str) – If env_name is specified, a gym environment with that name will be created. If no such environment exists, a gym.error is raised.
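The wrapping pattern described above can be sketched without the real dependencies. The stand-in `StubSpace`/`StubEnv` classes below are hypothetical placeholders for gym objects, and the `Wrapper` class only illustrates the delegate-and-convert structure the text describes; the real GarageEnv additionally converts gym.Spaces to akro spaces for pickling.

```python
class StubSpace:
    """Minimal stand-in for a gym.Space (illustrative only)."""
    def __init__(self, n):
        self.n = n

class StubEnv:
    """Minimal stand-in for a gym.Env: a 3-step episode with reward 1 per step."""
    def __init__(self):
        self.action_space = StubSpace(2)
        self.observation_space = StubSpace(4)
        self._t = 0
    def reset(self):
        self._t = 0
        return 0.0
    def step(self, action):
        self._t += 1
        return 0.0, 1.0, self._t >= 3, {}

class Wrapper:
    """Sketch of the GarageEnv pattern: hold the inner env, expose its
    spaces (the real class converts them to akro), and delegate calls."""
    def __init__(self, env):
        self.env = env
        self.action_space = env.action_space
        self.observation_space = env.observation_space
    def reset(self, **kwargs):
        return self.env.reset(**kwargs)
    def step(self, action):
        return self.env.step(action)
    def close(self):
        pass

env = Wrapper(StubEnv())
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = env.step(0)
    total_reward += reward
print(total_reward)  # 3.0
```

Samplers interact only with the wrapper's reset/step/close surface, which is why wrapping is sufficient for checkpointing compatibility.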
close()[source]

Close the wrapped env.

reset(**kwargs)[source]

Call reset on wrapped env.

This method is necessary to suppress a deprecated warning thrown by gym.Wrapper.

Parameters:kwargs – Keyword args
Returns:The initial observation.
Return type:object
step(action)[source]

Call step on wrapped env.

This method is necessary to suppress a deprecated warning thrown by gym.Wrapper.

Parameters:action (object) – An action provided by the agent.
Returns:
  • object – Agent’s observation of the current environment
  • float – Amount of reward returned after previous action
  • bool – Whether the episode has ended, in which case further step() calls will return undefined results
  • dict – Contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:tuple
Step(observation, reward, done, **kwargs)[source]

Create a namedtuple from the results of environment.step(action).

Provides the option to put extra diagnostic info in the kwargs (if it exists) without demanding an explicit positional argument.

Parameters:
  • observation (object) – Agent’s observation of the current environment
  • reward (float) – Amount of reward returned after previous action
  • done (bool) – Whether the episode has ended, in which case further step() calls will return undefined results
  • kwargs – Keyword args
Returns:

A named tuple of the arguments.

Return type:

collections.namedtuple
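The Step helper's behavior, as the parameter list describes it, can be sketched as a thin factory over a namedtuple, with extra keyword arguments collected into an `info` dict. The exact field names are an assumption based on the step() documentation above.

```python
import collections

# Sketch of the Step helper: pack step() results into a namedtuple,
# with any extra keyword arguments collected into the `info` field.
_Step = collections.namedtuple('Step', ['observation', 'reward', 'done', 'info'])

def Step(observation, reward, done, **kwargs):
    """Create a Step namedtuple; kwargs become the diagnostic info dict."""
    return _Step(observation, reward, done, kwargs)

s = Step([0.1, 0.2], reward=1.0, done=False, distance=0.5)
print(s.reward)  # 1.0
print(s.info)    # {'distance': 0.5}
```

Because the result is a namedtuple, it still unpacks positionally like a plain `(observation, reward, done, info)` tuple.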

class EnvSpec(observation_space, action_space)[source]

Bases: object

EnvSpec class.

Parameters:
  • observation_space (akro.Space) – The observation space of the env.
  • action_space (akro.Space) – The action space of the env.
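A minimal sketch of what an EnvSpec construction looks like, assuming only the two attributes documented above. `Box` here is a hypothetical stand-in for akro.Box, which the real class expects.

```python
class Box:
    """Stand-in for akro.Box: a bounded continuous space (illustrative only)."""
    def __init__(self, low, high, shape):
        self.low, self.high, self.shape = low, high, shape

class EnvSpec:
    """Plain container pairing an observation space with an action space."""
    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space
        self.action_space = action_space

spec = EnvSpec(observation_space=Box(-1.0, 1.0, shape=(2,)),
               action_space=Box(-0.1, 0.1, shape=(2,)))
print(spec.action_space.shape)  # (2,)
```

Policies and samplers typically read only these two attributes from the spec, so a plain container suffices.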
class GridWorldEnv(desc='4x4')[source]

Bases: gym.core.Env

‘S’ : starting point
‘F’ or ‘.’: free space
‘W’ or ‘x’: wall
‘H’ or ‘o’: hole (terminates episode)
‘G’ : goal
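The map legend above can be exercised by parsing a description grid. The specific 4x4 layout below is an assumption for illustration (the default GridWorldEnv map may differ); only the legend characters are taken from the text.

```python
# Hypothetical 4x4 grid using the legend above:
# S = start, F = free, H = hole, G = goal
desc = [
    'SFFF',
    'FHFH',
    'FFFH',
    'HFFG',
]

start = goal = None
holes = []
for row, line in enumerate(desc):
    for col, cell in enumerate(line):
        if cell == 'S':
            start = (row, col)
        elif cell == 'G':
            goal = (row, col)
        elif cell in ('H', 'o'):
            holes.append((row, col))

print(start)       # (0, 0)
print(goal)        # (3, 3)
print(len(holes))  # 4
```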
static action_from_direction(d)[source]

Return the action corresponding to the given direction. This is a helper method for debugging and testing purposes.

Returns:the action index corresponding to the given direction

action_space
get_possible_next_states(state, action)[source]

Given the state and action, return a list of possible next states and their probabilities. Only next states with nonzero probabilities will be returned.

Parameters:
  • state – start state
  • action – action
Returns:a list of pairs (s’, p(s’|s,a))
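For a deterministic grid, get_possible_next_states reduces to a single `(next_state, 1.0)` pair; the sketch below illustrates that shape, clipping moves at the grid boundary. The flattened state indexing (`row * n_cols + col`) and deterministic transitions are assumptions for illustration.

```python
# Deterministic sketch of get_possible_next_states on a 4x4 grid.
# States are flattened indices: state = row * N_COLS + col.
N_ROWS, N_COLS = 4, 4
# Action map per the step() documentation: 0 left, 1 down, 2 right, 3 up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def get_possible_next_states(state, action):
    """Return [(s', p(s'|s,a))]; deterministic, so a single pair with p=1."""
    row, col = divmod(state, N_COLS)
    dr, dc = MOVES[action]
    nr = min(max(row + dr, 0), N_ROWS - 1)  # clip at grid boundary
    nc = min(max(col + dc, 0), N_COLS - 1)
    return [(nr * N_COLS + nc, 1.0)]

print(get_possible_next_states(0, 2))  # [(1, 1.0)]  right from (0, 0)
print(get_possible_next_states(0, 3))  # [(0, 1.0)]  up is clipped at the border
```

A stochastic grid would instead return several pairs whose probabilities sum to 1.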

log_diagnostics(paths)[source]
observation_space
render(mode='human')[source]

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.
  • rgb_array: Return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
Parameters:mode (str) – the mode to render with

Example:

class MyEnv(Env):
    metadata = {'render.modes': ['human', 'rgb_array']}

    def render(self, mode='human'):
        if mode == 'rgb_array':
            return np.array(...)  # return RGB frame suitable for video
        elif mode == 'human':
            ...  # pop up a window and render
        else:
            super(MyEnv, self).render(mode=mode)  # just raise an exception
reset()[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Action map: 0: left, 1: down, 2: right, 3: up.

Parameters:action – should be a one-hot vector encoding the action
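Decoding the one-hot action vector this step() expects can be sketched as follows; the `decode_action` helper and `ACTION_NAMES` list are illustrative, not part of the class.

```python
# Map action indices to directions, per the action map above.
ACTION_NAMES = ['left', 'down', 'right', 'up']

def decode_action(one_hot):
    """Turn a one-hot action vector into the integer action index."""
    assert sum(one_hot) == 1 and all(v in (0, 1) for v in one_hot)
    return one_hot.index(1)

a = decode_action([0, 0, 1, 0])
print(a, ACTION_NAMES[a])  # 2 right
```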

normalize

alias of garage.envs.normalized_env.NormalizedEnv
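A wrapper like NormalizedEnv typically rescales observations using running statistics. The sketch below shows that general idea only; the exponential-average update rule and constants are assumptions for illustration, not garage's actual implementation.

```python
# Sketch of running observation normalization (scalar case for brevity).
class RunningNormalizer:
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # smoothing factor for the running statistics
        self.mean = 0.0
        self.var = 1.0
    def update(self, obs):
        # Exponentially weighted estimates of mean and variance.
        self.mean = (1 - self.alpha) * self.mean + self.alpha * obs
        self.var = (1 - self.alpha) * self.var + self.alpha * (obs - self.mean) ** 2
    def normalize(self, obs):
        return (obs - self.mean) / (self.var ** 0.5 + 1e-8)

norm = RunningNormalizer()
for obs in [10.0, 10.0, 10.0]:
    norm.update(obs)
# As the running mean approaches the observations, the normalized
# magnitude shrinks well below the raw value.
print(abs(norm.normalize(10.0)) < 10.0)  # True
```

Normalization like this keeps observation scales comparable across environments, which stabilizes learning for scale-sensitive policies.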

class PointEnv(goal=array([1., 1.], dtype=float32), done_bonus=0.0, never_done=False)[source]

Bases: gym.core.Env

A simple 2D point environment.

observation_space

The observation space

Type:gym.spaces.Box
action_space

The action space

Type:gym.spaces.Box
Parameters:
  • goal (np.ndarray, optional) – A 2D array representing the goal position
  • done_bonus (float, optional) – A numerical bonus added to the reward once the point has reached the goal
  • never_done (bool, optional) – Never send a done signal, even if the agent achieves the goal.
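The three parameters above can be illustrated with a toy point environment. Everything in this sketch beyond the parameter semantics (the negative-distance reward, the done threshold, the starting position) is an assumption for illustration, not PointEnv's actual dynamics.

```python
import math

class PointSketch:
    """Toy 2D point env: action is a displacement, reward is -distance
    to the goal plus done_bonus when the goal is reached (illustrative)."""
    def __init__(self, goal=(1.0, 1.0), done_bonus=0.0, never_done=False):
        self.goal = goal
        self.done_bonus = done_bonus
        self.never_done = never_done
        self.pos = [0.0, 0.0]
    def reset(self):
        self.pos = [0.0, 0.0]
        return tuple(self.pos)
    def step(self, action):
        self.pos[0] += action[0]
        self.pos[1] += action[1]
        dist = math.hypot(self.pos[0] - self.goal[0], self.pos[1] - self.goal[1])
        done = (dist < 1e-2) and not self.never_done
        reward = -dist + (self.done_bonus if done else 0.0)
        return tuple(self.pos), reward, done, {'distance': dist}

env = PointSketch(done_bonus=1.0)
env.reset()
obs, reward, done, info = env.step((1.0, 1.0))  # jump straight to the goal
print(done)    # True
print(reward)  # 1.0 (zero distance plus the done bonus)
```

With `never_done=True` the same step would return `done=False`, which is useful for infinite-horizon training setups.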
action_space
observation_space
render(mode='human')[source]

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.
  • rgb_array: Return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
Parameters:mode (str) – the mode to render with

Example:

class MyEnv(Env):
    metadata = {'render.modes': ['human', 'rgb_array']}

    def render(self, mode='human'):
        if mode == 'rgb_array':
            return np.array(...)  # return RGB frame suitable for video
        elif mode == 'human':
            ...  # pop up a window and render
        else:
            super(MyEnv, self).render(mode=mode)  # just raise an exception
reset()[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:
  • observation (object) – agent’s observation of the current environment
  • reward (float) – amount of reward returned after previous action
  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:tuple