garage.envs.grid_world_env module

class GridWorldEnv(desc='4x4')

Bases: gym.core.Env

Map legend:

'S' : starting point
'F' or '.': free space
'W' or 'x': wall
'H' or 'o': hole (terminates episode)
'G' : goal
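
As an illustration of the legend above, here is a minimal sketch of how a map written with these characters could be scanned. The EXAMPLE_MAP layout and the summarize_map helper are hypothetical and not part of garage; the layout is not claimed to match the built-in '4x4' map.

```python
# Hypothetical 4x4 layout using the legend characters; not garage's built-in map.
EXAMPLE_MAP = [
    "SFFF",
    "FWFH",
    "FFFH",
    "HFFG",
]

# Alternate spellings from the legend, normalized to the letter form.
ALIASES = {".": "F", "x": "W", "o": "H"}


def summarize_map(rows):
    """Return the (row, col) cells of the start, goal, walls, and holes."""
    summary = {"start": None, "goal": None, "walls": [], "holes": []}
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            ch = ALIASES.get(ch, ch)
            if ch == "S":
                summary["start"] = (r, c)
            elif ch == "G":
                summary["goal"] = (r, c)
            elif ch == "W":
                summary["walls"].append((r, c))
            elif ch == "H":
                summary["holes"].append((r, c))
            elif ch != "F":
                raise ValueError(f"unknown map character {ch!r} at {(r, c)}")
    return summary
```

Calling summarize_map(EXAMPLE_MAP) collects the special cells in one pass, which can be handy when checking a hand-written map before passing it to an environment.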

static action_from_direction(d)

Return the action corresponding to the given direction. This is a helper method for debugging and testing purposes.

Returns:	the action index corresponding to the given direction
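
A minimal sketch of such a lookup, using the action indices listed under step(action). The source does not say what type d has, so the string names used as keys here are an assumption, and action_from_direction_sketch is a hypothetical stand-in rather than the library's implementation.

```python
# Action indices follow the action map documented for step(action).
# The direction names used as keys are an assumption for illustration.
DIRECTION_TO_ACTION = {"left": 0, "down": 1, "right": 2, "up": 3}


def action_from_direction_sketch(d):
    """Return the action index for a direction name (hypothetical helper)."""
    try:
        return DIRECTION_TO_ACTION[d]
    except KeyError:
        raise ValueError(f"unknown direction: {d!r}") from None
```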

get_possible_next_states(state, action)

Given the state and action, return a list of possible next states and their probabilities. Only next states with nonzero probabilities are returned.

Parameters:	state – start state
	action – action
Returns:	a list of pairs (s', p(s'|s,a))
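
To make the return format concrete, the following is a rough sketch of a deterministic transition function that yields a single (s', p(s'|s,a)) pair. It rests on several assumptions the source does not confirm: states are row-major cell indices, moves off the grid are clipped, walls block movement, and holes and the goal are absorbing. The function name and the desc argument (a list of strings in letter form) are hypothetical.

```python
# Row/column increments indexed by action: 0 left, 1 down, 2 right, 3 up.
INCREMENTS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def possible_next_states_sketch(desc, state, action):
    """Return [(next_state, probability)] for a deterministic grid (sketch).

    Assumptions (not confirmed by the source): states are row-major cell
    indices, moves off the grid are clipped, walls block movement, and
    holes and the goal are absorbing.
    """
    n_row, n_col = len(desc), len(desc[0])
    row, col = divmod(state, n_col)
    if desc[row][col] in ("H", "G"):
        return [(state, 1.0)]  # terminal cells stay put
    d_row, d_col = INCREMENTS[action]
    next_row = min(max(row + d_row, 0), n_row - 1)
    next_col = min(max(col + d_col, 0), n_col - 1)
    if desc[next_row][next_col] == "W":
        return [(state, 1.0)]  # walls block the move
    return [(next_row * n_col + next_col, 1.0)]
```

Because the sketch is deterministic, the list always has one entry with probability 1.0; a stochastic variant would return several pairs whose probabilities sum to 1.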

render(mode='human')

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.

Parameters:	mode (str) – the mode to render with

Example:

    import numpy as np
    from gym import Env

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

reset()

Resets the state of the environment and returns an initial observation.

Returns:	the initial observation.
Return type:	observation (object)

step(action)

Action map:

0: left
1: down
2: right
3: up

Parameters:	action – should be a one-hot vector encoding the action
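
For readers unfamiliar with one-hot encoding, here is a small sketch of converting between an action index from the action map and a one-hot vector. The helper names one_hot and decode_one_hot are hypothetical, and whether a given version of the environment accepts a one-hot vector or a plain index should be checked against its source.

```python
ACTION_NAMES = ["left", "down", "right", "up"]  # list position = action index


def one_hot(action_index, n_actions=len(ACTION_NAMES)):
    """Encode an action index as a one-hot list (hypothetical helper)."""
    if not 0 <= action_index < n_actions:
        raise ValueError(f"action index out of range: {action_index}")
    return [1 if i == action_index else 0 for i in range(n_actions)]


def decode_one_hot(vector):
    """Return the index of the single 1 in a one-hot vector."""
    values = list(vector)
    # A valid one-hot vector sorts to all zeros followed by a single one.
    if sorted(values) != [0] * (len(values) - 1) + [1]:
        raise ValueError(f"not a one-hot vector: {values}")
    return values.index(1)
```

Round-tripping an index through one_hot and decode_one_hot returns the original index, so either representation can be used in tests and converted at the call site.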