garage.np.policies.uniform_random_policy

Uniform random exploration strategy.

class UniformRandomPolicy(env_spec)

Bases: garage.np.policies.policy.Policy

Action taken is uniformly random.

Parameters

env_spec (EnvSpec) – Environment spec to explore.
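
As a rough illustration of what "uniformly random" means here, the sketch below mimics the documented interface using only the Python standard library. BoxSpace and SketchUniformRandomPolicy are hypothetical stand-ins, not garage or akro classes. The real policy is constructed from an EnvSpec and draws from its action space. The empty agent_info is an assumption of this sketch.

```python
import random


class BoxSpace:
    """Hypothetical stand-in for a bounded continuous action space."""

    def __init__(self, low, high):
        self.low = list(low)
        self.high = list(high)

    def sample(self):
        # Draw each dimension independently and uniformly within its bounds.
        return [random.uniform(lo, hi) for lo, hi in zip(self.low, self.high)]


class SketchUniformRandomPolicy:
    """Illustrative policy (not garage's class) that ignores observations."""

    def __init__(self, action_space):
        self.action_space = action_space

    def get_action(self, observation):
        # The observation is unused: every action is an independent draw.
        # Returning an empty dict as agent_info is an assumption here.
        return self.action_space.sample(), {}


policy = SketchUniformRandomPolicy(BoxSpace(low=[-1.0, 0.0], high=[1.0, 2.0]))
action, agent_info = policy.get_action(observation=[0.3, 0.7])
```

Because the observation never influences the draw, a policy of this kind is typically useful as an exploration baseline or for collecting initial experience, not as a learned controller.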

property name

Name of policy.

Returns

Name of policy.

Return type

str

property env_spec

Policy environment specification.

Returns

Environment specification.

Return type

garage.EnvSpec

property observation_space

Observation space.

Returns

The observation space of the environment.

Return type

akro.Space

property action_space

Action space.

Returns

The action space of the environment.

Return type

akro.Space

reset(do_resets=None)

Reset the state of the exploration.

Parameters

do_resets (List[bool] or numpy.ndarray or None) – Which vectorization states to reset.

get_action(observation)

Get action from this policy for the input observation.

Parameters

observation (numpy.ndarray) – Observation from the environment.

Returns

np.ndarray: Action sampled uniformly at random from the action space. List[dict]: Arbitrary policy state information (agent_info).

Return type

tuple

get_actions(observations)

Get actions from this policy for the input observations.

Parameters

observations (list) – Observations from the environment.

Returns

np.ndarray: Actions sampled uniformly at random from the action space, one per observation. List[dict]: Arbitrary policy state information (agent_info).

Return type

tuple
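
The batched call can be pictured as one independent uniform draw per observation. The following is a minimal sketch under that assumption. sample_uniform_actions, its low/high bounds, and the rng parameter are hypothetical names introduced for illustration and are not part of garage.

```python
import random


def sample_uniform_actions(observations, low, high, rng=None):
    """Hypothetical helper: draw one uniform action per observation."""
    if rng is None:
        rng = random.Random()
    actions = [
        # Observations only determine how many actions are drawn.
        [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
        for _ in observations
    ]
    # No policy state is tracked in this sketch, so agent_infos stays empty.
    agent_infos = {}
    return actions, agent_infos


observations = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
actions, agent_infos = sample_uniform_actions(
    observations, low=[-1.0], high=[1.0], rng=random.Random(0))
```

Passing a seeded random.Random instance, as above, makes the draws reproducible across runs, which can help when comparing against a random baseline.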