`garage.np.exploration_policies.exploration_policy`

Exploration Policy API used by off-policy algorithms.

class ExplorationPolicy(policy)

Bases: abc.ABC

Policy that wraps another policy to add action noise.

abstract get_action(observation)

Return an action with noise.

Parameters: observation (np.ndarray) – Observation from the environment.
Returns: A tuple (action, agent_info): the action with noise (np.ndarray) and arbitrary policy state information (dict).
Return type: Tuple[np.ndarray, dict]

abstract get_actions(observations)

Return actions with noise.

Parameters: observations (np.ndarray) – Observations from the environment.
Returns: A tuple (actions, agent_infos): the actions with noise (np.ndarray) and arbitrary policy state information (List[dict]).
Return type: Tuple[np.ndarray, List[dict]]
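The wrapping pattern described above can be sketched as follows. This is a minimal illustration, not garage's own code: `ExplorationPolicySketch`, `ConstantPolicy`, and `GaussianNoisePolicy` are hypothetical names, Gaussian noise is just one possible choice, and plain Python lists stand in for `np.ndarray` so the example needs only the standard library.

```python
import abc
import random


class ExplorationPolicySketch(abc.ABC):
    """Stand-in mirroring the documented interface (not garage's class)."""

    def __init__(self, policy):
        self.policy = policy

    @abc.abstractmethod
    def get_action(self, observation):
        """Return (noisy_action, agent_info)."""

    @abc.abstractmethod
    def get_actions(self, observations):
        """Return (noisy_actions, agent_infos)."""


class ConstantPolicy:
    """Hypothetical wrapped policy that always outputs zeros."""

    def get_action(self, observation):
        return [0.0 for _ in observation], {}


class GaussianNoisePolicy(ExplorationPolicySketch):
    """Hypothetical subclass adding zero-mean Gaussian noise per action."""

    def __init__(self, policy, sigma=0.1, seed=0):
        super().__init__(policy)
        self._sigma = sigma
        self._rng = random.Random(seed)

    def get_action(self, observation):
        # Ask the wrapped policy first, then perturb its action.
        action, agent_info = self.policy.get_action(observation)
        noisy = [a + self._rng.gauss(0.0, self._sigma) for a in action]
        return noisy, agent_info

    def get_actions(self, observations):
        # Batched variant: one (action, agent_info) pair per observation.
        results = [self.get_action(obs) for obs in observations]
        actions = [action for action, _ in results]
        agent_infos = [info for _, info in results]
        return actions, agent_infos


explorer = GaussianNoisePolicy(ConstantPolicy(), sigma=0.1)
action, info = explorer.get_action([0.5, -0.5])
actions, infos = explorer.get_actions([[0.5, -0.5], [1.0, 2.0]])
```

Because both methods are abstract, the base class itself cannot be instantiated; only subclasses that define both can.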

reset(dones=None)

Reset the state of the exploration.

Parameters: dones (List[bool] or numpy.ndarray or None) – Which vectorization states to reset.
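One way to read the `dones` argument is as a per-environment mask. The sketch below assumes that `None` means "reset everything" and that exploration state is kept per vectorized environment; `NoiseStateSketch` and its `step` method are hypothetical and are not part of the documented API.

```python
class NoiseStateSketch:
    """Hypothetical exploration state, one value per vectorized environment."""

    def __init__(self, n_envs):
        self.state = [0.0] * n_envs

    def step(self):
        # Pretend the noise process accumulates something on each step.
        self.state = [s + 1.0 for s in self.state]

    def reset(self, dones=None):
        # Assumption: dones=None resets every environment's state.
        if dones is None:
            dones = [True] * len(self.state)
        # Clear state only where the episode has finished.
        self.state = [0.0 if done else s
                      for s, done in zip(self.state, dones)]


noise = NoiseStateSketch(n_envs=3)
noise.step()
noise.step()
noise.reset([True, False, False])
partial = list(noise.state)   # only the first environment is cleared
noise.reset()
full = list(noise.state)      # every environment is cleared
```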

update(episode_batch)

Update the exploration policy using a batch of trajectories.

Parameters: episode_batch (EpisodeBatch) – A batch of trajectories which were sampled with this policy active.
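As an illustration of what `update` might be used for, the sketch below counts environment steps from each batch and uses the total to anneal an exploration rate. Everything here is an assumption for illustration: `EpisodeBatchStub` is a hypothetical stand-in that exposes only a `lengths` field, and the linear decay schedule in `DecayingEpsilon` is one possible choice, not the library's behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpisodeBatchStub:
    """Hypothetical stand-in exposing only per-episode lengths."""
    lengths: List[int]


class DecayingEpsilon:
    """Hypothetical exploration schedule driven by update()."""

    def __init__(self, max_epsilon=1.0, min_epsilon=0.1, decay_steps=100):
        self._max = max_epsilon
        self._min = min_epsilon
        self._decay_steps = decay_steps
        self.total_env_steps = 0

    def update(self, episode_batch):
        # Count every environment step sampled while this policy was active.
        self.total_env_steps += sum(episode_batch.lengths)

    @property
    def epsilon(self):
        # Linear decay from max to min over decay_steps, then hold at min.
        fraction = min(self.total_env_steps / self._decay_steps, 1.0)
        return self._max - fraction * (self._max - self._min)


schedule = DecayingEpsilon()
schedule.update(EpisodeBatchStub(lengths=[20, 30]))
halfway = schedule.epsilon    # half of the decay window has elapsed
schedule.update(EpisodeBatchStub(lengths=[100]))
floor = schedule.epsilon      # past the window, held at the minimum
```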

get_param_values()

Get parameter values.

set_param_values(params)

Set parameter values.
