`garage.replay_buffer.replay_buffer`¶

This module implements a replay buffer memory.

A replay buffer is an important technique in reinforcement learning. It stores transitions in a fixed-size memory buffer; when the buffer is full, the oldest transitions are discarded. At each step, a batch of transitions is sampled from the buffer to update the agent’s parameters. Sampling in this way breaks the temporal correlations between consecutive transitions, which benefits RL algorithms.
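
The behaviour described above (fixed capacity, oldest transitions dropped first, uniform batch sampling) can be sketched in a few lines. This is only an illustrative, standalone sketch and not garage's implementation: the class name `TinyReplayBuffer` and the dictionary layout of each transition are assumptions made for the example, although the method and property names mirror the ones documented on this page.

```python
import random
from collections import deque


class TinyReplayBuffer:
    """Hypothetical minimal FIFO replay buffer (not garage's class)."""

    def __init__(self, size_in_transitions):
        # A deque with maxlen silently drops the oldest item when full.
        self._storage = deque(maxlen=size_in_transitions)

    def add_transition(self, transition):
        # Store one transition (here: any dict-like record).
        self._storage.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self._storage), batch_size)

    @property
    def full(self):
        return len(self._storage) == self._storage.maxlen

    @property
    def n_transitions_stored(self):
        return len(self._storage)


buffer = TinyReplayBuffer(size_in_transitions=3)
for step in range(5):
    # Five transitions are added to a buffer of capacity three,
    # so the two oldest (steps 0 and 1) are discarded.
    buffer.add_transition({'observation': step, 'reward': float(step)})

stored = sorted(t['observation'] for t in buffer._storage)
batch = buffer.sample(batch_size=2)
```

Because the batch is drawn at random from everything still in the buffer, consecutive environment steps rarely end up next to each other in an update, which is the decorrelation effect mentioned above.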

class ReplayBuffer(env_spec, size_in_transitions, time_horizon)¶

Abstract class for Replay Buffer.

Parameters

env_spec (EnvSpec) – Environment specification.
size_in_transitions (int) – Total number of transitions the buffer can hold.
time_horizon (int) – Time horizon of an episode.

store_episode(self)¶

Add an episode to the buffer.

abstract sample(self, batch_size)¶

Sample a batch of batch_size transitions.

Parameters: batch_size (int) – The number of transitions to be sampled.

add_transition(self, **kwargs)¶

Add one transition into the replay buffer.

Parameters: kwargs (dict(str, [numpy.ndarray])) – Dictionary that holds the transitions.

add_transitions(self, **kwargs)¶

Add multiple transitions into the replay buffer.

A transition contains one or more entries, e.g. observation, action, reward, terminal and next_observation. Each entry is stacked across all transitions, e.g. {'observation': [obs1, obs2, obs3]}, where obs1 is a single numpy.ndarray observation from the environment.

Parameters: kwargs (dict(str, [numpy.ndarray])) – Dictionary that holds the transitions.
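
To make the stacked layout concrete, the sketch below converts a list of per-step transition dictionaries into the per-entry form described above. The helper `stack_transitions` and the sample values are hypothetical and exist only for illustration; they are not part of garage.

```python
import numpy as np

# Three hypothetical transitions, each holding one array per entry.
transitions = [
    {'observation': np.array([0.0, 1.0]),
     'action': np.array([1.0]),
     'reward': np.array([0.5])},
    {'observation': np.array([1.0, 2.0]),
     'action': np.array([0.0]),
     'reward': np.array([1.0])},
    {'observation': np.array([2.0, 3.0]),
     'action': np.array([1.0]),
     'reward': np.array([0.0])},
]


def stack_transitions(transitions):
    """Group the same entry of every transition under one key.

    Hypothetical helper: returns a dict mapping each entry name to a
    list with one numpy.ndarray per transition, in the original order.
    """
    keys = transitions[0].keys()
    return {key: [t[key] for t in transitions] for key in keys}


stacked = stack_transitions(transitions)
# stacked['observation'] is now [obs1, obs2, obs3].
```

Given the `**kwargs` signature above, a dictionary in this form would presumably be passed to a concrete subclass as `add_transitions(**stacked)`; which entry names are required depends on the subclass and is not specified here.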

property full(self)¶

Whether the buffer is full.

Returns: True if the buffer has reached its maximum size, False otherwise.
Return type: bool

property n_transitions_stored(self)¶

Return the size of the replay buffer.

Returns: Number of transitions currently stored in the replay buffer.
Return type: int
