garage.sampler.default_worker
Default Worker class.
class DefaultWorker(*, seed, max_episode_length, worker_number)
Bases: garage.sampler.worker.Worker
Initialize a worker.
Parameters:
- seed (int) – The seed to use to initialize random number generators.
- max_episode_length (int or float) – The maximum length of episodes which will be sampled. Can be (floating point) infinity.
- worker_number (int) – The number of the worker where this update is occurring. This argument is used to set a different seed for each worker.
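As an illustration of how worker_number can give each worker its own seed, the sketch below offsets a base seed by the worker's number. The offset scheme and the helper name derive_worker_seed are assumptions made for this example; this page does not state how DefaultWorker actually combines the two values.

```python
import random


def derive_worker_seed(seed, worker_number):
    """Hypothetical helper: offset the base seed by the worker's number."""
    return seed + worker_number


base_seed = 42

# One independent generator per worker, each seeded differently.
rngs = [random.Random(derive_worker_seed(base_seed, n)) for n in range(3)]

# The first draw of each worker differs because the seeds differ.
draws = [rng.random() for rng in rngs]
```

Because the derived seed depends only on seed and worker_number, re-creating a worker with the same pair reproduces the same random stream.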
env
The worker’s environment.
Type: Environment or None
worker_init(self)
Initialize a worker.
update_agent(self, agent_update)
Update an agent, assuming it implements Policy.
Parameters:
- agent_update (np.ndarray or dict or Policy) – If a tuple, dict, or np.ndarray, these should be parameters to the agent, generated by calling Policy.get_param_values. Alternatively, a policy itself. Note that other implementations of Worker may take different types for this parameter.
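The two kinds of agent_update described above (a set of parameters, or a whole policy) can be sketched as a small dispatch. ToyPolicy, its set_param_values method, and apply_agent_update are hypothetical stand-ins written for this example, and only the dict case is shown; a tuple or np.ndarray of parameters would take the same branch.

```python
class ToyPolicy:
    """Hypothetical stand-in for a policy with named parameters."""

    def __init__(self, params):
        self._params = dict(params)

    def get_param_values(self):
        # Return a copy so callers cannot mutate the policy in place.
        return dict(self._params)

    def set_param_values(self, params):
        self._params = dict(params)


def apply_agent_update(agent, agent_update):
    """Return the agent to use after applying agent_update (sketch only)."""
    if isinstance(agent_update, dict):
        # Parameters: load them into the existing agent.
        agent.set_param_values(agent_update)
        return agent
    # Otherwise treat the update as a replacement policy.
    return agent_update


worker_agent = ToyPolicy({"w": 0.0})
trained = ToyPolicy({"w": 1.5})

# Ship only the parameters, as produced by get_param_values.
worker_agent = apply_agent_update(worker_agent, trained.get_param_values())
```

Sending parameters rather than the policy object keeps the update small when the worker already holds a policy of the right structure.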
update_env(self, env_update)
Use any non-None env_update as a new environment.
A simple env update function. If env_update is not None, it should be the complete new environment.
This allows changing environments by passing the new environment as env_update into obtain_samples.
Parameters:
- env_update (Environment or EnvUpdate or None) – The environment to replace the existing env with. Note that other implementations of Worker may take different types for this parameter.
Raises: TypeError – If env_update is not one of the documented types.
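The behavior described for update_env (ignore None, accept a new environment, reject anything else with TypeError) can be sketched as a type dispatch. The Environment and EnvUpdate classes below are hypothetical stand-ins, not the library's own types. Treating an EnvUpdate as a callable that receives the old environment is likewise an assumption of this sketch.

```python
class Environment:
    """Hypothetical stand-in for an environment."""

    def __init__(self, name):
        self.name = name


class EnvUpdate:
    """Hypothetical stand-in: a callable that produces a new environment."""

    def __init__(self, new_name):
        self.new_name = new_name

    def __call__(self, old_env):
        return Environment(self.new_name)


def apply_env_update(old_env, env_update):
    """Return the environment to use after applying env_update."""
    if env_update is None:
        # Nothing to do: keep the current environment.
        return old_env
    if isinstance(env_update, EnvUpdate):
        return env_update(old_env)
    if isinstance(env_update, Environment):
        # The update is the complete new environment.
        return env_update
    raise TypeError(
        "env_update must be an Environment, an EnvUpdate, or None, "
        f"got {type(env_update).__name__}"
    )


env = Environment("first")
env = apply_env_update(env, None)                 # unchanged
env = apply_env_update(env, Environment("second"))
env = apply_env_update(env, EnvUpdate("third"))
```

Checking for None first keeps the common no-op call cheap, and the final TypeError surfaces misuse immediately instead of storing an unusable object as the environment.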
start_episode(self)
Begin a new episode.
step_episode(self)
Take a single time-step in the current episode.
Returns: True if and only if the episode is done, either because the environment indicated termination or because max_episode_length was reached.
Return type: bool
collect_episode(self)
Collect the current episode, clearing the internal buffer.
Returns: A batch of the episodes completed since the last call to collect_episode().
Return type: EpisodeBatch
rollout(self)
Sample a single episode of the agent in the environment.
Returns: The collected episode.
Return type: EpisodeBatch
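One way to read the episode methods together is that rollout composes start_episode, step_episode, and collect_episode. The toy class below sketches that composition under stated assumptions: ToyWorker and its terminal_step argument are hypothetical, the "environment" is just a step counter, and collect_episode returns a plain list rather than an EpisodeBatch.

```python
class ToyWorker:
    """Hypothetical worker mirroring the documented method names."""

    def __init__(self, max_episode_length, terminal_step):
        self.max_episode_length = max_episode_length
        self.terminal_step = terminal_step  # step at which the toy env ends
        self._t = 0
        self._steps = []  # internal buffer of recorded steps

    def start_episode(self):
        self._t = 0

    def step_episode(self):
        self._t += 1
        self._steps.append(self._t)
        terminated = self._t >= self.terminal_step
        # Done if the env terminated or the length limit was reached.
        return terminated or self._t >= self.max_episode_length

    def collect_episode(self):
        # Hand back the buffer and clear it for the next episode.
        steps, self._steps = self._steps, []
        return steps

    def rollout(self):
        self.start_episode()
        while not self.step_episode():
            pass
        return self.collect_episode()


# An infinite length limit leaves termination entirely to the environment.
unbounded = ToyWorker(max_episode_length=float("inf"), terminal_step=3)
episode_a = unbounded.rollout()

# A finite limit cuts the episode short before the environment terminates.
truncated = ToyWorker(max_episode_length=2, terminal_step=5)
episode_b = truncated.rollout()
```

Here episode_a has three steps and episode_b has two, which shows both ways step_episode can report that an episode is done.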
shutdown(self)
Close the worker’s environment.