Worker that “vectorizes” environments.
VecWorker(*, seed, max_episode_length, worker_number, n_envs=DEFAULT_N_ENVS)
Worker with a single policy and multiple environments.
Alternates between asking the policy for an action for every environment and taking a single step in all of them. This allows actions to be computed in a batch, which is generally much more efficient than computing them one at a time when using neural networks, as sketched below.
seed (int) – The seed to use to initialize random number generators.
max_episode_length (int) – The maximum length of episodes which will be sampled.
worker_number (int) – The number of the worker this update is occurring in. This argument is used to set a different seed for each worker.
n_envs (int) – Number of environment copies to use.
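To illustrate why batching matters, here is a minimal sketch of the alternating pattern described above. It is an illustration of the idea, not garage's internal implementation; get_actions is assumed here to be a batched method returning one action per observation.

```python
import numpy as np

def vectorized_step(policy, envs, observations):
    """Sketch: one batched policy call, then one step in every env."""
    # A single forward pass over all n_envs observations is generally
    # much cheaper than n_envs separate forward passes through a
    # neural network.
    actions = policy.get_actions(np.stack(observations))  # assumed batched API
    # Step each environment copy once with its own action.
    return [env.step(action) for env, action in zip(envs, actions)]
```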
Update an agent, assuming it implements Policy.
agent_update (np.ndarray or dict or Policy) – If an np.ndarray or dict, these should be parameters for the agent, generated by calling Policy.get_param_values. Alternatively, a policy itself may be passed. Note that other implementations of Worker may take different types for this parameter.
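For example, a sampler might propagate the central policy's parameters to the worker as follows (a usage sketch; worker is an already-constructed VecWorker and policy stands in for any concrete garage policy):

```python
# Push parameters generated by Policy.get_param_values...
worker.update_agent(policy.get_param_values())

# ...or hand over the policy object itself.
worker.update_agent(policy)
```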
Update the environments.
If passed a list (nested inside the list passed to the Sampler itself), the environments are distributed across the “vectorization” dimension.
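A usage sketch, assuming a hypothetical make_env factory that constructs garage environments:

```python
# One environment per copy along the "vectorization" dimension;
# make_env is a hypothetical factory returning garage environments.
worker.update_env([make_env(task) for task in tasks])

# A single (non-list) update is presumably applied to every copy.
worker.update_env(make_env(tasks[0]))
```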
Begin a new episode.
Take a single time-step in the current episode.
True iff at least one of the episodes was completed.
- Return type – bool
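Taken together, start_episode and step_episode suggest the following rollout loop (a sketch, not the sampler's exact logic):

```python
# Begin fresh episodes in all n_envs environments.
worker.start_episode()

# Step every environment until at least one episode completes;
# step_episode() returns True iff that has happened.
while not worker.step_episode():
    pass
```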
Collect all completed episodes.
- A batch of the episodes completed since the last call to collect_episode().
- Return type – EpisodeBatch
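A sketch of collecting the results after a rollout; the lengths attribute is assumed here as a way to count episodes in the returned EpisodeBatch:

```python
# Gather everything completed since the previous collection.
batch = worker.collect_episode()

# EpisodeBatch is assumed to expose per-episode lengths.
print('collected', len(batch.lengths), 'episodes')
```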
Close the worker’s environments.
Initialize a worker.
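A minimal construction sketch; all arguments are keyword-only per the signature above, the import path is assumed, and the specific values are illustrative:

```python
from garage.sampler import VecWorker  # assumed import path

worker = VecWorker(seed=42,
                   max_episode_length=200,
                   worker_number=0,
                   n_envs=8)
```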