garage.envs package
Garage wrappers for gym environments.
class GarageEnv(env=None, env_name='')
Bases: gym.core.Wrapper
An abstract Garage wrapper class for gym.Env.
In order to provide pickling (serialization) and parameterization for gym.Envs, they must be wrapped with a GarageEnv. This ensures compatibility with existing samplers and checkpointing when the envs are passed internally around garage.
Furthermore, classes inheriting from GarageEnv should silently convert action_space and observation_space from gym.Spaces to akro.spaces.
Parameters:
- env (gym.Env) – An env that will be wrapped.
- env_name (str) – If env_name is specified, a gym environment with that name will be created. If no such environment exists, a gym.error is raised.
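As a sketch of the pattern GarageEnv implements: hold an inner env, delegate reset and step to it, and keep the whole wrapper picklable so samplers and checkpointing can serialize it. The class names below (ToyEnv, ToyWrapper) are illustrative stand-ins, not garage's API.

```python
import pickle


class ToyEnv:
    """Stand-in for a gym.Env: counts steps until done."""

    def __init__(self):
        self._t = 0

    def reset(self):
        self._t = 0
        return self._t

    def step(self, action):
        self._t += 1
        return self._t, 1.0, self._t >= 3, {}


class ToyWrapper:
    """Stand-in for GarageEnv: delegates to the wrapped env."""

    def __init__(self, env=None):
        self.env = env if env is not None else ToyEnv()

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        return self.env.step(action)


wrapped = ToyWrapper(ToyEnv())
wrapped.reset()
obs, reward, done, info = wrapped.step(0)
# The wrapper round-trips through pickle, as GarageEnv aims to.
clone = pickle.loads(pickle.dumps(wrapped))
```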
reset(**kwargs)
Call reset on the wrapped env.
This method is necessary to suppress a deprecation warning raised by gym.Wrapper.
Parameters: kwargs – Keyword arguments.
Returns: The initial observation.
Return type: object
step(action)
Call step on the wrapped env.
This method is necessary to suppress a deprecation warning raised by gym.Wrapper.
Parameters: action (object) – An action provided by the agent.
Returns:
- object: Agent's observation of the current environment.
- float: Amount of reward returned after the previous action.
- bool: Whether the episode has ended, in which case further step() calls will return undefined results.
- dict: Contains auxiliary diagnostic information (helpful for debugging, and sometimes learning).
Step(observation, reward, done, **kwargs)
Create a namedtuple from the results of environment.step(action).
Provides the option to put extra diagnostic info in the kwargs (if it exists) without demanding an explicit positional argument.
Parameters: observation, reward, done – the results of environment.step(action); kwargs – optional diagnostic info.
Returns: A namedtuple of the arguments.
Return type: collections.namedtuple
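A minimal reimplementation of this helper using only the standard library (garage's actual Step may differ in detail) shows the intent: bundle the positional step results and fold any extra keyword diagnostics into an info field.

```python
import collections

_Step = collections.namedtuple('Step', ['observation', 'reward', 'done', 'info'])


def Step(observation, reward, done, **kwargs):
    """Bundle step results; extra kwargs become the info dict."""
    return _Step(observation, reward, done, kwargs)


s = Step([0.0, 0.0], -1.0, False, goal_reached=False)
# Fields are accessible by name: s.reward, s.info['goal_reached']
```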
class EnvSpec(observation_space, action_space)
Bases: object
EnvSpec class.
Parameters:
- observation_space (akro.Space) – The observation space of the env.
- action_space (akro.Space) – The action space of the env.
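Conceptually an EnvSpec is just a container for the two spaces, so algorithms can size their networks without holding the env itself. A minimal sketch (Box here is a stand-in for an akro.Space subclass, not garage's class):

```python
class Box:
    """Stand-in for an akro.Space with bounds and a shape."""

    def __init__(self, low, high, shape):
        self.low, self.high, self.shape = low, high, shape


class EnvSpecSketch:
    """Minimal EnvSpec-like container for the two spaces."""

    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space
        self.action_space = action_space


spec = EnvSpecSketch(observation_space=Box(-1.0, 1.0, (2,)),
                     action_space=Box(-0.1, 0.1, (2,)))
```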
class GridWorldEnv(desc='4x4')
Bases: gym.core.Env
Map legend:
- 'S' : starting point
- 'F' or '.' : free space
- 'W' or 'x' : wall
- 'H' or 'o' : hole (terminates episode)
- 'G' : goal
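To illustrate the legend, here is a hypothetical 4x4 map (the actual map garage selects for desc='4x4' may differ) and how its characters translate into coordinates:

```python
# A hypothetical 4x4 map using the legend above.
desc = ['SFFF',
        'FHFH',
        'FFFH',
        'HFFG']

# Scan the map once for the start, the goal, and every hole.
start = next((r, c) for r, row in enumerate(desc)
             for c, ch in enumerate(row) if ch == 'S')
goal = next((r, c) for r, row in enumerate(desc)
            for c, ch in enumerate(row) if ch == 'G')
holes = {(r, c) for r, row in enumerate(desc)
         for c, ch in enumerate(row) if ch in 'Ho'}
```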
static action_from_direction(d)
Return the action corresponding to the given direction. This is a helper method for debugging and testing purposes.
Returns: the action index corresponding to the given direction.
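A sketch of such a helper: map direction names to action indices via a fixed ordering. The particular ordering below is an assumption for illustration, not necessarily the one GridWorldEnv uses.

```python
# Assumed direction ordering; GridWorldEnv's actual ordering may differ.
DIRECTIONS = ['left', 'down', 'right', 'up']


def action_from_direction(d):
    """Return the action index for a direction name."""
    return DIRECTIONS.index(d)
```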
action_space
get_possible_next_states(state, action)
Given the state and action, return a list of possible next states and their probabilities. Only next states with nonzero probabilities will be returned.
Parameters: state – start state; action – action.
Returns: a list of pairs (s', p(s'|s,a)).
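The return shape can be sketched with deterministic dynamics on a 4x4 grid, where the state index is row * n_cols + col and the single reachable next state has probability 1.0. The env itself may add stochasticity or wall handling; this only illustrates the (s', p(s'|s,a)) contract.

```python
# Deterministic next-state lookup on a 4x4 grid (illustrative only).
N_ROWS = N_COLS = 4
# Assumed action ordering: 0=left, 1=down, 2=right, 3=up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def get_possible_next_states(state, action):
    """Return [(next_state, probability)] pairs; moves off-grid clamp."""
    row, col = divmod(state, N_COLS)
    dr, dc = MOVES[action]
    nr = min(max(row + dr, 0), N_ROWS - 1)
    nc = min(max(col + dc, 0), N_COLS - 1)
    return [(nr * N_COLS + nc, 1.0)]
```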
observation_space
render(mode='human')
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
- human: render to the current display or terminal and return nothing. Usually for human consumption.
- rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note: Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.
Parameters: mode (str) – the mode to render with
Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
normalize
class PointEnv(goal=array([1., 1.], dtype=float32), done_bonus=0.0, never_done=False)
Bases: gym.core.Env
A simple 2D point environment.
observation_space
The observation space.
Type: gym.spaces.Box

action_space
The action space.
Type: gym.spaces.Box
render(mode='human')
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
- human: render to the current display or terminal and return nothing. Usually for human consumption.
- rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note: Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.
Parameters: mode (str) – the mode to render with
Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
reset()
Resets the state of the environment and returns an initial observation.
Returns: the initial observation.
Return type: observation (object)
step(action)
Run one timestep of the environment's dynamics. When the end of the episode is reached, you are responsible for calling reset() to reset this environment's state.
Accepts an action and returns a tuple (observation, reward, done, info).
Parameters: action (object) – an action provided by the agent.
Returns:
- observation (object): agent's observation of the current environment.
- reward (float): amount of reward returned after the previous action.
- done (bool): whether the episode has ended, in which case further step() calls will return undefined results.
- info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning).
Subpackages
- garage.envs.dm_control package
- garage.envs.wrappers package
- Submodules
- garage.envs.wrappers.atari_env module
- garage.envs.wrappers.clip_reward module
- garage.envs.wrappers.episodic_life module
- garage.envs.wrappers.fire_reset module
- garage.envs.wrappers.grayscale module
- garage.envs.wrappers.max_and_skip module
- garage.envs.wrappers.noop module
- garage.envs.wrappers.resize module
- garage.envs.wrappers.stack_frames module