garage.np.exploration_policies.exploration_policy
Exploration Policy API used by off-policy algorithms.
class ExplorationPolicy(policy)
Bases: abc.ABC
Policy that wraps another policy to add action noise.
- Parameters
policy (garage.Policy) – Policy to wrap.
abstract get_action(self, observation)
Return an action with noise.
- Parameters
observation (np.ndarray) – Observation from the environment.
- Returns
np.ndarray: An action with noise.
dict: Arbitrary policy state information (agent_info).
- Return type
(np.ndarray, dict)
abstract get_actions(self, observations)
Return actions with noise.
- Parameters
observations (np.ndarray) – Observations from the environment.
- Returns
np.ndarray: Actions with noise.
List[dict]: Arbitrary policy state information (agent_info).
- Return type
(np.ndarray, List[dict])
update(self, episode_batch)
Update the exploration policy using a batch of trajectories.
- Parameters
episode_batch (EpisodeBatch) – A batch of trajectories which were sampled with this policy active.
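The interface above can be illustrated with a minimal sketch of a wrapper that adds Gaussian noise to the wrapped policy's actions. To keep the example runnable without garage installed, it defines a local stand-in base class (ExplorationPolicySketch) that mirrors only the three methods documented here; it is not garage's own class. ConstantPolicy, GaussianNoisePolicy, the sigma and seed parameters, and the no-op default for update are all assumptions made for illustration.

```python
import abc

import numpy as np


class ExplorationPolicySketch(abc.ABC):
    """Local stand-in mirroring the documented interface (not garage's class)."""

    def __init__(self, policy):
        self.policy = policy  # the wrapped policy

    @abc.abstractmethod
    def get_action(self, observation):
        """Return (action, agent_info) for a single observation."""

    @abc.abstractmethod
    def get_actions(self, observations):
        """Return (actions, agent_infos) for a batch of observations."""

    def update(self, episode_batch):
        """Assumed default: nothing to update."""


class ConstantPolicy:
    """Hypothetical wrapped policy that always outputs zeros."""

    def __init__(self, action_dim):
        self._action_dim = action_dim

    def get_action(self, observation):
        return np.zeros(self._action_dim), {}

    def get_actions(self, observations):
        n = len(observations)
        return np.zeros((n, self._action_dim)), [{} for _ in range(n)]


class GaussianNoisePolicy(ExplorationPolicySketch):
    """Hypothetical wrapper that adds zero-mean Gaussian noise to actions."""

    def __init__(self, policy, sigma=0.1, seed=0):
        super().__init__(policy)
        self._sigma = sigma
        self._rng = np.random.default_rng(seed)

    def get_action(self, observation):
        action, agent_info = self.policy.get_action(observation)
        noise = self._rng.normal(0.0, self._sigma, size=action.shape)
        return action + noise, agent_info

    def get_actions(self, observations):
        actions, agent_infos = self.policy.get_actions(observations)
        noise = self._rng.normal(0.0, self._sigma, size=actions.shape)
        return actions + noise, agent_infos


explorer = GaussianNoisePolicy(ConstantPolicy(action_dim=2), sigma=0.1)

# Single observation: one noisy action plus the agent_info dict.
action, agent_info = explorer.get_action(np.zeros(3))

# Batch of four observations: one noisy action and one dict per row.
actions, agent_infos = explorer.get_actions(np.zeros((4, 3)))
```

Both methods return a pair, so callers unpack the noisy action (or actions) and the agent_info separately. A wrapper whose noise depends on training progress, such as a decaying noise scale, would be the natural place to override update.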