garage.replay_buffer.base module
This module implements a replay buffer memory.
A replay buffer is an important technique in reinforcement learning. It stores transitions in a memory buffer of fixed size; when the buffer is full, the oldest transitions are discarded. At each step, a batch of transitions is sampled from the buffer to update the agent's parameters. In short, the replay buffer breaks temporal correlations and thus benefits RL algorithms.
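The fixed-size, discard-oldest behavior described above can be sketched with a minimal buffer built on `collections.deque` (an illustration of the concept, not garage's actual implementation):

```python
import random
from collections import deque


class FIFOReplayBuffer:
    """Minimal fixed-size replay buffer: oldest transitions are discarded when full."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest item automatically on overflow.
        self._storage = deque(maxlen=capacity)

    def add(self, transition):
        self._storage.append(transition)

    def sample(self, batch_size):
        # Uniform random sampling breaks temporal correlation between updates.
        return random.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


buf = FIFOReplayBuffer(capacity=3)
for step in range(5):
    buf.add({'observation': step})
# Capacity is 3, so the transitions for steps 0 and 1 have been discarded.
batch = buf.sample(2)
```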
- class ReplayBuffer(env_spec, size_in_transitions, time_horizon)[source]
  Bases: object
  Abstract class for Replay Buffer.
  Parameters:
  - env_spec (garage.envs.EnvSpec) – Environment specification.
  - size_in_transitions (int) – Total size of the buffer, in transitions.
  - time_horizon (int) – Time horizon of a rollout.
  - add_transitions(**kwargs)[source]
    Add multiple transitions into the replay buffer.
    A transition contains one or more entries, e.g. observation, action, reward, terminal and next_observation. The same entry of all transitions is stacked, e.g. {'observation': [obs1, obs2, obs3]}, where obs1 is one numpy.ndarray observation from the environment.
    Parameters: kwargs (dict(str, [numpy.ndarray])) – Dictionary that holds the transitions.
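Since ReplayBuffer is abstract, the following is only a sketch of the keyword layout that add_transitions expects under the stacking convention above; the array shapes (observation dim 4, action dim 2) and the subclass call are illustrative assumptions:

```python
import numpy as np

# One keyword per entry; each array stacks that entry across 3 transitions.
transitions = {
    'observation': np.random.rand(3, 4),        # assumed obs_dim = 4
    'action': np.random.rand(3, 2),             # assumed act_dim = 2
    'reward': np.array([0.0, 1.0, 0.5]),
    'terminal': np.array([False, False, True]),
    'next_observation': np.random.rand(3, 4),
}

# With a concrete subclass, the call would look like:
#   replay_buffer.add_transitions(**transitions)
```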
  - full
    Whether the buffer is full.
  - n_transitions_stored
    Return the size of the replay buffer.
    Returns: Number of transitions currently stored in the buffer.
    Return type: int