garage.sampler.off_policy_vectorized_sampler module

This module implements a vectorized sampler used for off-policy algorithms.

It differs from OnPolicyVectorizedSampler in two ways:
  • The number of envs is defined by rollout_batch_size. In OnPolicyVectorizedSampler, the number of envs can be decided by batch_size and max_path_length, but off-policy algorithms usually sample transitions from a replay buffer, which only has buffer_batch_size.
  • It needs to add transitions to the replay buffer throughout the rollout.
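The two differences above can be illustrated with a small, self-contained sketch. It does not use garage and is not the library's implementation: ToyEnv, rollout_into_buffer, the random policy, and the deque used as a replay buffer are all hypothetical stand-ins. Only the names rollout_batch_size and buffer_batch_size come from the description above.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical environment: an episode ends after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        obs = float(self.t)
        reward = -abs(action)
        done = self.t >= self.horizon
        return obs, reward, done


def rollout_into_buffer(envs, policy, buffer, n_steps):
    """Step every env in lockstep and store each transition immediately."""
    observations = [env.reset() for env in envs]
    for _ in range(n_steps):
        for i, env in enumerate(envs):
            obs = observations[i]
            action = policy(obs)
            next_obs, reward, done = env.step(action)
            # The transition is stored during the rollout, not after it ends.
            buffer.append((obs, action, reward, next_obs, done))
            observations[i] = env.reset() if done else next_obs
    return buffer


rollout_batch_size = 4  # assumed here to equal the number of env copies
buffer_batch_size = 8   # transitions drawn from the buffer per training step

envs = [ToyEnv() for _ in range(rollout_batch_size)]
replay_buffer = deque(maxlen=1000)  # stand-in for a real replay buffer
rollout_into_buffer(envs, lambda obs: random.uniform(-1.0, 1.0),
                    replay_buffer, n_steps=10)

# Training draws a fixed-size minibatch, independent of the rollout length.
minibatch = random.sample(list(replay_buffer), buffer_batch_size)
```

In this sketch the number of environment copies is set directly by rollout_batch_size, while the training minibatch size is set separately by buffer_batch_size, which is why the two quantities are decoupled in the off-policy case.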

class OffPolicyVectorizedSampler(algo, env, n_envs=None, no_reset=True)

Bases: garage.sampler.batch_sampler.BatchSampler

This class implements OffPolicyVectorizedSampler.

Parameters:
  • algo (garage.np.RLAlgorithm) – Algorithm.
  • env (garage.envs.GarageEnv) – Environment.
  • n_envs (int) – Number of parallel environments managed by sampler.
  • no_reset (bool) – Reset environment between samples or not.
obtain_samples(itr, batch_size)

Collect samples for the given iteration number.

Parameters:
  • itr (int) – Iteration number.
  • batch_size (int) – Number of environment interactions in one batch.
Returns:

A list of paths.

Return type:

list

shutdown_worker()

Terminate workers if necessary.

start_worker()

Initialize the sampler.
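The three methods documented above suggest a start, sample, shutdown call order. The sketch below shows that order with a hypothetical StandInSampler that only mirrors the documented method names; it is not garage's class. The even split of batch_size across envs and the dict layout of each path are assumptions made for illustration.

```python
class StandInSampler:
    """Hypothetical stand-in exposing the three documented method names."""

    def __init__(self, n_envs=2):
        self.n_envs = n_envs
        self.started = False

    def start_worker(self):
        # Initialize the sampler.
        self.started = True

    def obtain_samples(self, itr, batch_size):
        if not self.started:
            raise RuntimeError('call start_worker() before sampling')
        # Assumption: batch_size interactions are split evenly across envs.
        steps_per_env = batch_size // self.n_envs
        return [
            {'env': i, 'itr': itr, 'rewards': [0.0] * steps_per_env}
            for i in range(self.n_envs)
        ]

    def shutdown_worker(self):
        # Terminate workers if necessary.
        self.started = False


sampler = StandInSampler(n_envs=2)
sampler.start_worker()
try:
    for itr in range(3):
        paths = sampler.obtain_samples(itr, batch_size=10)
        total_steps = sum(len(path['rewards']) for path in paths)
finally:
    # Shut down even if sampling raises.
    sampler.shutdown_worker()
```

Wrapping the sampling loop in try/finally is one way to make sure shutdown_worker() runs even when an iteration fails.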