garage.sampler.on_policy_vectorized_sampler module

BatchSampler which uses VecEnvExecutor to run multiple environments.

class OnPolicyVectorizedSampler(algo, env, n_envs=None)[source]

Bases: garage.sampler.batch_sampler.BatchSampler

BatchSampler which uses VecEnvExecutor to run multiple environments.

Parameters:
  • algo – The algorithm instance that the sampler collects trajectories for.
  • env – The environment instance to sample from.
  • n_envs (int) – Number of environment instances to run in parallel. Defaults to None.
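
A minimal construction sketch (algo and env here are assumed placeholders for an already-built algorithm and environment instance, not part of this module; the import path is taken from this page):

    from garage.sampler.on_policy_vectorized_sampler import (
        OnPolicyVectorizedSampler,
    )

    # Run four copies of env in parallel through the VecEnvExecutor.
    sampler = OnPolicyVectorizedSampler(algo, env, n_envs=4)
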
obtain_samples(itr, batch_size=None, whole_paths=True)[source]

Sample the policy for new trajectories.

Parameters:
  • itr (int) – Iteration number.
  • batch_size (int) – Number of samples to be collected. If None, it defaults to algo.max_path_length * n_envs.
  • whole_paths (bool) – Whether to return the sampled paths in full. True by default. The total number of collected samples may exceed batch_size; when this flag is False, the paths are truncated so that the total matches batch_size.
Returns:

Sample paths.

Return type:

list[dict]

Note

Each path is a dictionary with the following keys and values:
  • observations: numpy.ndarray with shape [Batch, *obs_dims]
  • actions: numpy.ndarray with shape [Batch, *act_dims]
  • rewards: numpy.ndarray with shape [Batch, ]
  • env_infos: A dictionary in which each key names one piece of environment info and each value is a numpy.ndarray with shape [Batch, ?]. One example is “ale.lives” for Atari environments.
  • agent_infos: A dictionary in which each key names one piece of agent info and each value is a numpy.ndarray with shape [Batch, ?]. One example is “prev_action”, which recurrent policies use as the previous-action input, merged with the observation input to form the state input.
  • dones: numpy.ndarray with shape [Batch, ]
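
A sketch of consuming this structure (continuing the construction example above; the iteration number and batch size are arbitrary):

    # Each element of paths is one trajectory dictionary.
    paths = sampler.obtain_samples(itr=0, batch_size=4000)
    for path in paths:
        undiscounted_return = path['rewards'].sum()   # scalar per path
        path_length = path['observations'].shape[0]   # the Batch dimension
        finished = bool(path['dones'][-1])            # True if the env terminated
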
shutdown_worker()[source]

Shut down workers.

start_worker()[source]

Start workers.
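
In typical use, these two worker calls bracket the sampling loop (a sketch continuing the example above; n_itr is an assumed loop bound):

    sampler.start_worker()              # set up the vectorized environments
    try:
        for itr in range(n_itr):
            paths = sampler.obtain_samples(itr)
    finally:
        sampler.shutdown_worker()       # always release the environments
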