garage.np.exploration_policies.exploration_policy

Exploration Policy API used by off-policy algorithms.

class ExplorationPolicy(policy)[source]

Bases: abc.ABC


Policy that wraps another policy to add action noise.

Parameters

policy (garage.Policy) – Policy to wrap.
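The wrapping pattern described above can be sketched without importing garage. The block below is a minimal illustration only: `SketchExplorationPolicy`, `ConstantPolicy`, and `GaussianNoisePolicy` are hypothetical stand-ins written for this example, and the Gaussian noise model is an assumption, not something this page specifies.

```python
import abc

import numpy as np


class SketchExplorationPolicy(abc.ABC):
    """Stand-in that mirrors the documented interface (not garage's class)."""

    def __init__(self, policy):
        self.policy = policy  # the wrapped policy

    @abc.abstractmethod
    def get_action(self, observation):
        """Return a noisy action and the agent_info dict."""


class ConstantPolicy:
    """Hypothetical deterministic policy that always outputs zeros."""

    def __init__(self, action_dim):
        self._action_dim = action_dim

    def get_action(self, observation):
        return np.zeros(self._action_dim), {}


class GaussianNoisePolicy(SketchExplorationPolicy):
    """Adds zero-mean Gaussian noise to the wrapped policy's action."""

    def __init__(self, policy, sigma=0.1, seed=0):
        super().__init__(policy)
        self._sigma = sigma
        self._rng = np.random.default_rng(seed)

    def get_action(self, observation):
        # Ask the wrapped policy first, then perturb its action.
        action, agent_info = self.policy.get_action(observation)
        noise = self._rng.normal(0.0, self._sigma, size=action.shape)
        return action + noise, agent_info


explorer = GaussianNoisePolicy(ConstantPolicy(action_dim=2), sigma=0.1)
action, agent_info = explorer.get_action(np.array([0.5, -0.5, 1.0]))
```

Keeping the wrapped policy as a plain attribute means the exploration noise can be changed or removed without touching the policy being trained.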

abstract get_action(self, observation)[source]

Return an action with noise.

Parameters

observation (np.ndarray) – Observation from the environment.

Returns

A tuple of the action with noise (np.ndarray) and arbitrary policy state information, agent_info (dict).

Return type

tuple[np.ndarray, dict]

abstract get_actions(self, observations)[source]

Return actions with noise.

Parameters

observations (np.ndarray) – Observations from the environment.

Returns

A tuple of the actions with noise (np.ndarray) and arbitrary policy state information, agent_info, one entry per observation (List[dict]).

Return type

tuple[np.ndarray, List[dict]]
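As a rough illustration of how the single and batched calls relate, the sketch below implements the batched method once and derives the single-observation method from it. `LinearPolicy` and `BatchedNoisePolicy` are hypothetical names, and the array shapes (one row per observation) are an assumption made for this example.

```python
import numpy as np


class LinearPolicy:
    """Hypothetical wrapped policy: actions = observations @ weights."""

    def __init__(self, weights):
        self._weights = np.asarray(weights, dtype=float)

    def get_actions(self, observations):
        actions = observations @ self._weights
        return actions, [{} for _ in range(len(observations))]


class BatchedNoisePolicy:
    """Sketch of an exploration wrapper with both call styles."""

    def __init__(self, policy, sigma=0.1, seed=0):
        self.policy = policy
        self._sigma = sigma
        self._rng = np.random.default_rng(seed)

    def get_actions(self, observations):
        # One noise sample per action entry, same shape as the batch.
        actions, agent_infos = self.policy.get_actions(observations)
        noise = self._rng.normal(0.0, self._sigma, size=actions.shape)
        return actions + noise, agent_infos

    def get_action(self, observation):
        # Treat a single observation as a batch of size one.
        actions, agent_infos = self.get_actions(observation[None])
        return actions[0], agent_infos[0]


explorer = BatchedNoisePolicy(LinearPolicy(np.ones((3, 2))))
batch_actions, batch_infos = explorer.get_actions(np.ones((4, 3)))
single_action, single_info = explorer.get_action(np.ones(3))
```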

reset(self, dones=None)[source]

Reset the state of the exploration.

Parameters

dones (List[bool] or numpy.ndarray or None) – Which vectorization states to reset.
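One way to read the `dones` argument is as a boolean mask over vectorized environments. The sketch below keeps one noise state per environment and clears only the masked rows. `PerEnvNoiseState` and its update rule are hypothetical, and treating `dones=None` as "reset everything" is an assumption that this page does not confirm.

```python
import numpy as np


class PerEnvNoiseState:
    """Hypothetical temporally correlated noise, one row per environment."""

    def __init__(self, n_envs, action_dim, seed=0):
        self.state = np.zeros((n_envs, action_dim))
        self._rng = np.random.default_rng(seed)

    def step(self, theta=0.15, sigma=0.2):
        # Mean-reverting update so the state carries over between steps.
        drift = -theta * self.state
        diffusion = sigma * self._rng.normal(size=self.state.shape)
        self.state = self.state + drift + diffusion
        return self.state.copy()

    def reset(self, dones=None):
        if dones is None:
            # Assumption: no mask means every environment is reset.
            self.state[:] = 0.0
        else:
            mask = np.asarray(dones, dtype=bool)
            self.state[mask] = 0.0


noise = PerEnvNoiseState(n_envs=3, action_dim=2)
for _ in range(5):
    noise.step()

# Only the first environment finished its episode.
noise.reset(dones=[True, False, False])
```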

update(self, episode_batch)[source]

Update the exploration policy using a batch of trajectories.

Parameters

episode_batch (EpisodeBatch) – A batch of trajectories which were sampled with this policy active.
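A common use for such an update hook is to advance a noise schedule by the number of environment steps just collected. The sketch below shows that idea with a linear epsilon decay. `FakeEpisodeBatch` is a hypothetical stand-in, its `lengths` field (steps per episode) is an assumption about what the batch exposes, and the decay rule is illustrative only.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FakeEpisodeBatch:
    """Hypothetical stand-in carrying only per-episode step counts."""

    lengths: np.ndarray


class DecayingEpsilon:
    """Sketch of a schedule advanced by update()."""

    def __init__(self, max_epsilon=1.0, min_epsilon=0.1, decay_steps=1000):
        self._max = max_epsilon
        self._min = min_epsilon
        self._decay_steps = decay_steps
        self._total_env_steps = 0

    def update(self, episode_batch):
        # Count every environment step sampled in this batch.
        self._total_env_steps += int(np.sum(episode_batch.lengths))

    @property
    def epsilon(self):
        # Linear interpolation from max to min, clamped at decay_steps.
        frac = min(self._total_env_steps / self._decay_steps, 1.0)
        return self._max - frac * (self._max - self._min)


schedule = DecayingEpsilon(max_epsilon=1.0, min_epsilon=0.1, decay_steps=1000)
schedule.update(FakeEpisodeBatch(lengths=np.array([200, 300])))
epsilon_mid = schedule.epsilon  # halfway through the decay

schedule.update(FakeEpisodeBatch(lengths=np.array([600])))
epsilon_end = schedule.epsilon  # past decay_steps, clamped at the minimum
```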

get_param_values(self)[source]

Get parameter values.

Returns

Values of each parameter.

Return type

list or dict

set_param_values(self, params)[source]

Set parameter values.

Parameters

params (np.ndarray) – A numpy array of parameter values.
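The getter and setter are typically used as a pair to snapshot and restore state. The sketch below shows a round trip that bundles the wrapper's own state with the wrapped policy's parameters in a dict. `TinyPolicy`, `CountingExplorer`, and the dict keys are hypothetical, and the dict layout is an assumption chosen for the example rather than the format garage uses.

```python
import numpy as np


class TinyPolicy:
    """Hypothetical wrapped policy with a flat weight vector."""

    def __init__(self, weights):
        self._weights = np.asarray(weights, dtype=float)

    def get_param_values(self):
        return self._weights.copy()

    def set_param_values(self, params):
        self._weights = np.asarray(params, dtype=float).copy()


class CountingExplorer:
    """Sketch of a wrapper whose state includes a step counter."""

    def __init__(self, policy):
        self.policy = policy
        self.total_env_steps = 0

    def get_param_values(self):
        # Bundle the wrapper's state with the wrapped policy's parameters.
        return {
            'total_env_steps': self.total_env_steps,
            'inner_params': self.policy.get_param_values(),
        }

    def set_param_values(self, params):
        self.total_env_steps = params['total_env_steps']
        self.policy.set_param_values(params['inner_params'])


source = CountingExplorer(TinyPolicy([1.0, 2.0]))
source.total_env_steps = 500

# Restore the snapshot into a freshly constructed wrapper.
target = CountingExplorer(TinyPolicy([0.0, 0.0]))
target.set_param_values(source.get_param_values())
```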