garage.replay_buffer.her_replay_buffer

This module implements a Hindsight Experience Replay (HER) buffer.

See: https://arxiv.org/abs/1707.01495.

class HERReplayBuffer(replay_k, reward_fn, capacity_in_transitions, env_spec)

Bases: garage.replay_buffer.path_buffer.PathBuffer

Replay buffer for HER (Hindsight Experience Replay).

It constructs hindsight examples using the "future" strategy: each substituted goal is taken from a state reached later in the same path.

Parameters
  • replay_k (int) – Number of HER transitions to add for each regular transition. Setting this to 0 means that no HER transitions are added.

  • reward_fn (callable) – Function to re-compute the reward with substituted goals.

  • capacity_in_transitions (int) – Maximum number of transitions the buffer can hold.

  • env_spec (EnvSpec) – Environment specification.
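To illustrate what a reward_fn could look like, the sketch below defines a sparse goal-reaching reward. The name sparse_goal_reward, the (achieved_goal, desired_goal, info) signature, and the 0.05 distance threshold are assumptions made for illustration, loosely following the convention of goal-based environments; this page does not specify the signature the buffer expects.

```python
import numpy as np


def sparse_goal_reward(achieved_goal, desired_goal, info=None, threshold=0.05):
    """Hypothetical reward: 0 when the goal is reached, -1 otherwise.

    Works on a single goal of shape (D,) or a batch of shape (N, D).
    """
    achieved_goal = np.asarray(achieved_goal, dtype=float)
    desired_goal = np.asarray(desired_goal, dtype=float)
    # Euclidean distance between what was achieved and what was desired.
    distance = np.linalg.norm(achieved_goal - desired_goal, axis=-1)
    return np.where(distance > threshold, -1.0, 0.0)
```

Because the reward depends only on the two goals, it can be re-evaluated after a goal has been substituted, which is what hindsight relabeling requires.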

add_path(self, path)

Adds a path to the replay buffer.

For each transition in the given path except the last one, replay_k HER transitions will be added to the buffer in addition to the one in the path. The last transition is added without sampling additional HER goals.

Parameters

path (dict[str, np.ndarray]) – Each key in the dict must map to a np.ndarray of shape \((T, S^*)\).
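The expansion described for add_path can be sketched with plain NumPy. This is not the library's code: the helper names future_goal_indices and expand_path are hypothetical, the toy goal arrays are made up, and drawing goal indices uniformly from the timesteps strictly after t is an assumption about how the "future" strategy is applied.

```python
import numpy as np


def future_goal_indices(path_len, t, replay_k, rng):
    """Sample replay_k indices from the timesteps strictly after t."""
    return rng.integers(t + 1, path_len, size=replay_k)


def expand_path(achieved_goals, desired_goals, replay_k, rng):
    """Return (timestep, goal) pairs: originals plus hindsight relabels."""
    path_len = len(achieved_goals)
    pairs = []
    for t in range(path_len):
        # The original transition keeps its original goal.
        pairs.append((t, desired_goals[t]))
        if t == path_len - 1:
            # Last transition: no later timestep to take a goal from.
            continue
        for j in future_goal_indices(path_len, t, replay_k, rng):
            # Hindsight copy: treat what was achieved later as the goal.
            pairs.append((t, achieved_goals[j]))
    return pairs


rng = np.random.default_rng(0)
T, replay_k = 5, 4
achieved = np.arange(T, dtype=float).reshape(T, 1)  # toy achieved goals
desired = np.full((T, 1), 10.0)                     # toy original goal
pairs = expand_path(achieved, desired, replay_k, rng)
print(len(pairs))  # T + replay_k * (T - 1) = 21
```

Under this reading, a path of T transitions contributes T + replay_k * (T - 1) transitions to the buffer, which is worth keeping in mind when choosing capacity_in_transitions.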

add_episode_batch(self, episodes)

Add an EpisodeBatch to the buffer.

Parameters

episodes (EpisodeBatch) – Episodes to add.

sample_path(self)

Sample a single path from the buffer.

Returns

A dict of arrays of shape (path_len, flat_dim).

Return type

dict

sample_transitions(self, batch_size)

Sample a batch of transitions from the buffer.

Parameters

batch_size (int) – Number of transitions to sample.

Returns

A dict of arrays of shape (batch_size, flat_dim).

Return type

dict
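The returned structure can be pictured with a small stand-alone sketch. It assumes, purely for illustration, that transitions are stored as a dict of preallocated arrays and that indices are drawn uniformly with replacement; the helper sample_from_storage and the storage layout are hypothetical and are not the buffer's internals.

```python
import numpy as np


def sample_from_storage(storage, n_stored, batch_size, rng):
    """Draw batch_size rows uniformly (with replacement) from each array."""
    # Only the first n_stored rows hold valid transitions.
    idx = rng.integers(0, n_stored, size=batch_size)
    return {key: arr[idx] for key, arr in storage.items()}


rng = np.random.default_rng(0)
storage = {
    'observations': np.zeros((100, 3)),
    'actions': np.zeros((100, 2)),
    'rewards': np.zeros((100, 1)),
}
batch = sample_from_storage(storage, n_stored=40, batch_size=8, rng=rng)
print({key: value.shape for key, value in batch.items()})
```

Using the same index array for every key keeps each sampled row aligned across observations, actions, and rewards.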

sample_timesteps(self, batch_size)

Sample a batch of timesteps from the buffer.

Parameters

batch_size (int) – Number of timesteps to sample.

Returns

The batch of timesteps.

Return type

TimeStepBatch

clear(self)

Clear buffer.

property n_transitions_stored(self)

Return the size of the replay buffer.

Returns

Size of the current replay buffer.

Return type

int