garage.tf.algos.te
Task Embedding Algorithm.
- class TaskEmbeddingWorker(*, seed, max_episode_length, worker_number)
Bases: garage.sampler.DefaultWorker
A sampler worker for Task Embedding Algorithm.
In addition to the behavior of DefaultWorker, this worker adds a one-hot task id to env_info, and adds the latent and latent infos to agent_info.
- Parameters
seed (int) – The seed to use to initialize random number generators.
max_episode_length (int or float) – The maximum length of episodes to sample. Can be (floating point) infinity.
worker_number (int) – The number of the worker where this update is occurring. This argument is used to set a different seed for each worker.
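As a minimal sketch of the seeding behavior described for worker_number: the text above only states that the worker number is used to give each worker a different seed, so offsetting the base seed by the worker number is an assumption made here for illustration. `derive_worker_seed` is a hypothetical helper, not part of garage.

```python
import random


def derive_worker_seed(seed, worker_number):
    """Hypothetical helper: offset the base seed by the worker number.

    Assumption: a simple additive offset. Returns None when no seed is given.
    """
    if seed is None:
        return None
    return seed + worker_number


# One generator per worker, each seeded differently, so the workers
# do not all replay the same random stream.
rngs = [random.Random(derive_worker_seed(1, n)) for n in range(3)]
draws = [rng.random() for rng in rngs]
```

With a shared base seed, runs stay reproducible while each worker still samples different episodes.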
- env
The worker’s environment.
- Type
Environment or None
- start_episode()
Begin a new episode.
- step_episode()
Take a single time-step in the current episode.
- Returns
- True iff the episode is done, either due to the environment indicating termination or due to reaching max_episode_length.
- Return type
bool
- collect_episode()
Collect the current episode, clearing the internal buffer.
One-hot task id is saved in env_infos['task_onehot']. Latent is saved in agent_infos['latent']. Latent infos are saved in agent_infos['latent_info_name'], where info_name is the original latent info name.
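The key layout described above can be sketched for a single time step as follows. This is an illustration only, not the library's implementation: `one_hot` and `add_embedding_infos` are hypothetical helpers, and the info names `mean` and `log_std` are assumed examples of latent info names.

```python
def one_hot(task_id, num_tasks):
    """Return a one-hot list encoding of task_id."""
    if not 0 <= task_id < num_tasks:
        raise ValueError('task_id must be in [0, num_tasks)')
    vec = [0.0] * num_tasks
    vec[task_id] = 1.0
    return vec


def add_embedding_infos(env_infos, agent_infos, task_id, num_tasks,
                        latent, latent_info):
    """Hypothetical helper mirroring the documented key names."""
    # The one-hot task id goes into env_infos under 'task_onehot'.
    env_infos = dict(env_infos, task_onehot=one_hot(task_id, num_tasks))
    # The latent goes into agent_infos under 'latent'.
    agent_infos = dict(agent_infos, latent=latent)
    # Each latent info is stored under its original name prefixed by 'latent_'.
    for name, value in latent_info.items():
        agent_infos['latent_' + name] = value
    return env_infos, agent_infos


env_infos, agent_infos = add_embedding_infos(
    env_infos={},
    agent_infos={'mean': [0.1]},
    task_id=2,
    num_tasks=4,
    latent=[0.3, -0.7],
    latent_info={'mean': [0.2, -0.5], 'log_std': [0.0, 0.0]})
```

The prefix keeps the latent infos from colliding with policy infos of the same name: here the policy's own `mean` and the latent's `latent_mean` coexist in agent_infos.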
- Returns
- A batch of the episodes completed since the last call to collect_episode().
- Return type
- worker_init()
Initialize a worker.
- update_agent(agent_update)
Update an agent, assuming it implements Policy.
- Parameters
agent_update (np.ndarray or dict or Policy) – If a tuple, dict, or np.ndarray, these should be parameters to agent, which should have been generated by calling Policy.get_param_values. Alternatively, a policy itself. Note that other implementations of Worker may take different types for this parameter.
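The two update styles described above (passing parameters versus passing a policy) can be sketched as below. `ToyPolicy` and `apply_agent_update` are hypothetical stand-ins written for this example. Only the dict case is handled, to keep the sketch dependency-free; tuple and np.ndarray parameters would presumably take the same branch.

```python
class ToyPolicy:
    """Hypothetical stand-in for a policy with get/set_param_values."""

    def __init__(self, params):
        self._params = dict(params)

    def get_param_values(self):
        return dict(self._params)

    def set_param_values(self, params):
        self._params = dict(params)


def apply_agent_update(agent, agent_update):
    """Return the agent to use after the update (sketch only)."""
    if isinstance(agent_update, dict):
        # Parameters: load them into the existing agent in place.
        agent.set_param_values(agent_update)
        return agent
    # Otherwise assume the update is itself a policy and swap it in.
    return agent_update


worker_agent = ToyPolicy({'w': 0.0})
trained = ToyPolicy({'w': 1.5})

# Parameter-style update: the worker keeps its agent object.
worker_agent = apply_agent_update(worker_agent, trained.get_param_values())

# Policy-style update: the worker's agent is replaced outright.
replaced = apply_agent_update(worker_agent, ToyPolicy({'w': 2.0}))
```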
- update_env(env_update)
Use any non-None env_update as a new environment.
A simple env update function. If env_update is not None, it should be the complete new environment.
This allows changing environments by passing the new environment as env_update into obtain_samples.
- Parameters
env_update (Environment or EnvUpdate or None) – The environment to replace the existing env with. Note that other implementations of Worker may take different types for this parameter.
- Raises
TypeError – If env_update is not one of the documented types.
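The dispatch on env_update described above might look roughly like this. All names here (`ToyEnv`, `ToyEnvUpdate`, `apply_env_update`) are hypothetical stand-ins rather than garage classes, and closing the old environment before replacing it is an assumption of this sketch.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class ToyEnvUpdate:
    """Hypothetical stand-in for an env update: maps the old env to a new one."""

    def __init__(self, new_name):
        self._new_name = new_name

    def __call__(self, old_env):
        if old_env is not None:
            old_env.close()
        return ToyEnv(self._new_name)


def apply_env_update(old_env, env_update):
    """Return the environment to use after the update (sketch only)."""
    if env_update is None:
        return old_env  # None means "keep the current environment"
    if isinstance(env_update, ToyEnvUpdate):
        return env_update(old_env)
    if isinstance(env_update, ToyEnv):
        if old_env is not None:
            old_env.close()  # assumption: release the old env first
        return env_update
    raise TypeError('Unknown environment update type.')


first = ToyEnv('a')
unchanged = apply_env_update(first, None)
second = apply_env_update(first, ToyEnv('b'))
third = apply_env_update(second, ToyEnvUpdate('c'))
```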
- rollout()
Sample a single episode of the agent in the environment.
- Returns
The collected episode.
- Return type
- shutdown()
Close the worker’s environment.
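The episode methods documented above plausibly compose into rollout() as a simple loop: start an episode, step until step_episode() reports done, then collect. The sketch below shows that loop on a toy worker. `ToyWorker` and its `terminate_at` argument are hypothetical, and the "episode" here is just a list of step indices rather than a real batch type.

```python
import math


class ToyWorker:
    """Hypothetical worker exposing the episode methods documented above."""

    def __init__(self, max_episode_length, terminate_at=None):
        self._max_episode_length = max_episode_length
        # Step at which the toy environment signals termination (None: never).
        self._terminate_at = terminate_at
        self._steps = []
        self._t = 0

    def start_episode(self):
        self._t = 0
        self._steps = []

    def step_episode(self):
        self._t += 1
        self._steps.append(self._t)
        terminated = (self._terminate_at is not None
                      and self._t >= self._terminate_at)
        # Done if the env terminated or the length limit was reached.
        return terminated or self._t >= self._max_episode_length

    def collect_episode(self):
        # Hand back the buffered steps and clear the internal buffer.
        episode, self._steps = self._steps, []
        return episode

    def rollout(self):
        self.start_episode()
        while not self.step_episode():
            pass
        return self.collect_episode()


# Ends by hitting max_episode_length.
truncated = ToyWorker(max_episode_length=5).rollout()

# Ends by environment termination; the length limit is infinite.
terminated = ToyWorker(max_episode_length=math.inf, terminate_at=3).rollout()
```

An infinite max_episode_length is safe only when the environment is guaranteed to terminate on its own; otherwise the loop never exits.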