garage.np.exploration_policies.exploration_policy
Exploration Policy API used by off-policy algorithms.
- class ExplorationPolicy(policy)
Bases: abc.ABC
Policy that wraps another policy to add action noise.
- Parameters
policy (garage.Policy) – Policy to wrap.
- abstract get_action(observation)
Return an action with noise.
- Parameters
observation (np.ndarray) – Observation from the environment.
- Returns
np.ndarray: An action with noise.
dict: Arbitrary policy state information (agent_info).
- Return type
tuple[np.ndarray, dict]
- abstract get_actions(observations)
Return actions with noise.
- Parameters
observations (np.ndarray) – Observations from the environment.
- Returns
np.ndarray: Actions with noise.
List[dict]: Arbitrary policy state information (agent_info).
- Return type
tuple[np.ndarray, List[dict]]
- reset(dones=None)
Reset the state of the exploration.
- Parameters
dones (List[bool] or numpy.ndarray or None) – Which vectorization states to reset.
- update(episode_batch)
Update the exploration policy using a batch of trajectories.
- Parameters
episode_batch (EpisodeBatch) – A batch of trajectories which were sampled with this policy active.
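
The interface above can be illustrated with a minimal, self-contained sketch. It does not import garage: the ExplorationPolicy class below is a stand-in that only mirrors the documented method signatures, and its default reset (forwarding to the wrapped policy) and no-op update are assumptions. ConstantPolicy, GaussianNoisePolicy, and the sigma, decay, and seed parameters are hypothetical names introduced for illustration only.

```python
import abc

import numpy as np


class ExplorationPolicy(abc.ABC):
    """Stand-in mirroring the documented interface (not garage's code)."""

    def __init__(self, policy):
        self.policy = policy

    @abc.abstractmethod
    def get_action(self, observation):
        """Return (action, agent_info)."""

    @abc.abstractmethod
    def get_actions(self, observations):
        """Return (actions, agent_infos)."""

    def reset(self, dones=None):
        # Assumption: resetting simply forwards to the wrapped policy.
        self.policy.reset(dones)

    def update(self, episode_batch):
        # Assumption: the default update does nothing.
        pass


class ConstantPolicy:
    """Hypothetical wrapped policy that always outputs zero actions."""

    def __init__(self, action_dim):
        self._action_dim = action_dim

    def get_action(self, observation):
        return np.zeros(self._action_dim), {}

    def get_actions(self, observations):
        n = len(observations)
        return np.zeros((n, self._action_dim)), [{} for _ in range(n)]

    def reset(self, dones=None):
        pass


class GaussianNoisePolicy(ExplorationPolicy):
    """Hypothetical subclass that adds decaying Gaussian action noise."""

    def __init__(self, policy, sigma=0.3, decay=0.9, seed=0):
        super().__init__(policy)
        self.sigma = sigma
        self._decay = decay
        self._rng = np.random.default_rng(seed)

    def get_action(self, observation):
        action, agent_info = self.policy.get_action(observation)
        noise = self._rng.normal(0.0, self.sigma, size=action.shape)
        return action + noise, agent_info

    def get_actions(self, observations):
        actions, agent_infos = self.policy.get_actions(observations)
        noise = self._rng.normal(0.0, self.sigma, size=actions.shape)
        return actions + noise, agent_infos

    def update(self, episode_batch):
        # Shrink the noise scale once per batch; the batch itself is unused.
        self.sigma *= self._decay


explorer = GaussianNoisePolicy(ConstantPolicy(action_dim=2))
action, agent_info = explorer.get_action(np.zeros(4))
actions, agent_infos = explorer.get_actions(np.zeros((3, 4)))
explorer.reset()
explorer.update(episode_batch=None)  # placeholder; a real EpisodeBatch goes here
```

In this sketch the noise scale decays inside update, which is one possible use of the hook; a real exploration policy might instead read statistics from the episode batch before adjusting its noise.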