garage.replay_buffer.her_replay_buffer
This module implements Hindsight Experience Replay (HER).
See: https://arxiv.org/abs/1707.01495.
class HERReplayBuffer(replay_k, reward_fn, capacity_in_transitions, env_spec)
Bases: garage.replay_buffer.path_buffer.PathBuffer
Replay buffer for HER (Hindsight Experience Replay).
It constructs hindsight examples using the "future" strategy.
Parameters:
- replay_k (int) – Number of HER transitions to add for each regular transition. Setting this to 0 means that no HER replays will be added.
- reward_fn (callable) – Function to re-compute the reward with substituted goals.
- capacity_in_transitions (int) – Total number of transitions the buffer can store.
- env_spec (EnvSpec) – Environment specification.
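The "future" strategy mentioned above relabels a transition's goal with an achieved goal sampled from a later step of the same path. The sketch below illustrates that idea with plain NumPy, independent of garage; the helper name `relabel_future` and the 1-D goals are illustrative, not part of the library's API.

```python
import numpy as np

def relabel_future(achieved_goals, t, replay_k, rng):
    """For the transition at index t, sample replay_k hindsight goals
    uniformly from the achieved goals of *later* steps in the path."""
    future_idx = rng.integers(low=t + 1, high=len(achieved_goals),
                              size=replay_k)
    return achieved_goals[future_idx]

rng = np.random.default_rng(0)
# Achieved goals along a path of length 5 (1-D goals for simplicity).
achieved = np.arange(5.0).reshape(5, 1)
goals = relabel_future(achieved, t=1, replay_k=4, rng=rng)
# Every substituted goal was actually achieved after step 1.
assert np.all(goals > achieved[1])
```

Each relabeled transition then has its reward recomputed by `reward_fn` against the substituted goal.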
n_transitions_stored
Return the size of the replay buffer.
Returns: Size of the current replay buffer. Return type: int
add_path(self, path)
Add a path to the replay buffer.
For each transition in the given path except the last one, replay_k HER transitions will be added to the buffer in addition to the original transition. The last transition is added without sampling additional HER goals.
Parameters: path (dict[str, np.ndarray]) – Each key in the dict must map to a np.ndarray of shape \((T, S^*)\).
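The description above fixes how much the buffer grows per path: every transition except the last contributes itself plus replay_k hindsight copies, and the last contributes only itself. A quick arithmetic check (the helper name is illustrative):

```python
def transitions_added(path_len, replay_k):
    # Every transition except the last contributes itself plus
    # replay_k hindsight copies; the last contributes only itself.
    return (path_len - 1) * (1 + replay_k) + 1

# A 50-step path with the common setting replay_k=4:
assert transitions_added(50, 4) == 246
# With replay_k=0 the path is stored as-is:
assert transitions_added(50, 0) == 50
```

This is why capacity_in_transitions should be sized with the (1 + replay_k) multiplier in mind.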
add_episode_batch(self, episodes)
Add an EpisodeBatch to the buffer.
Parameters: episodes (EpisodeBatch) – Episodes to add.
sample_path(self)
Sample a single path from the buffer.
Returns: A dict of arrays of shape (path_len, flat_dim). Return type: dict[str, np.ndarray]
sample_transitions(self, batch_size)
Sample a batch of transitions from the buffer.
Parameters: batch_size (int) – Number of transitions to sample.
Returns: A dict of arrays of shape (batch_size, flat_dim). Return type: dict
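The returned dict-of-arrays shape can be illustrated with a standalone sketch of uniform transition sampling; this is an assumed minimal model of the behavior, not garage's implementation.

```python
import numpy as np

def sample_uniform(buffer, batch_size, rng):
    """Uniformly sample transitions from a dict-of-arrays buffer,
    where each value has shape (n_transitions_stored, flat_dim)."""
    n = len(next(iter(buffer.values())))
    idx = rng.integers(n, size=batch_size)
    return {key: value[idx] for key, value in buffer.items()}

rng = np.random.default_rng(0)
buffer = {
    'observation': np.zeros((100, 8)),
    'action': np.zeros((100, 2)),
    'reward': np.zeros((100, 1)),
}
batch = sample_uniform(buffer, batch_size=32, rng=rng)
assert batch['observation'].shape == (32, 8)
```

A batch sampled this way is what an off-policy learner such as DDPG consumes each gradient step.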
clear(self)
Clear buffer.