garage.torch.policies.policy module

Base Policy.

class Policy(env_spec, name)

Bases: torch.nn.Module, abc.ABC

Policy base class.

Parameters:	
    env_spec (garage.envs.env_spec.EnvSpec) – Environment specification.
    name (str) – Name of policy.

action_space

The action space for the environment.

Returns:	Action space.
Return type:	akro.Space

get_action(observation)

Get a single action given an observation.

Parameters:	observation (torch.Tensor) – Observation from the environment.
Returns:	
    torch.Tensor: Predicted action.
    dict:
        list[float]: Mean of the distribution.
        list[float]: Log of standard deviation of the distribution.
Return type:	tuple

get_actions(observations)

Get actions given observations.

Parameters:	observations (torch.Tensor) – Observations from the environment.
Returns:	
    torch.Tensor: Predicted actions.
    dict:
        list[float]: Mean of the distribution.
        list[float]: Log of standard deviation of the distribution.
Return type:	tuple
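The return shape shared by get_action and get_actions can be sketched without garage or torch. Everything below is a hypothetical stand-in for illustration: the class ToyGaussianPolicy, the dictionary keys mean and log_std, and the use of plain Python lists in place of torch.Tensor are assumptions, not part of the documented API.

```python
import math
import random


class ToyGaussianPolicy:
    """Hypothetical stand-in; not a garage class."""

    def __init__(self, action_dim, log_std=-0.5, seed=0):
        self._action_dim = action_dim
        self._log_std = log_std
        self._rng = random.Random(seed)

    def get_actions(self, observations):
        """Return (actions, info) for a batch of observations."""
        actions, means, log_stds = [], [], []
        for obs in observations:
            # Toy "network": every action mean is the observation average.
            mean = [sum(obs) / len(obs)] * self._action_dim
            log_std = [self._log_std] * self._action_dim
            # Sample each component from a Gaussian with std exp(log_std).
            action = [
                self._rng.gauss(m, math.exp(s))
                for m, s in zip(mean, log_std)
            ]
            actions.append(action)
            means.append(mean)
            log_stds.append(log_std)
        # Assumed key names; the reference only says the dict holds both.
        return actions, {"mean": means, "log_std": log_stds}

    def get_action(self, observation):
        """Return (action, info) for a single observation."""
        actions, info = self.get_actions([observation])
        return actions[0], {key: value[0] for key, value in info.items()}


policy = ToyGaussianPolicy(action_dim=2)
action, info = policy.get_action([0.0, 1.0, 2.0])
batch_actions, batch_info = policy.get_actions([[0.0, 0.0], [2.0, 4.0]])
```

In this sketch the single-observation call delegates to the batched one and unwraps the first entry, so both methods return a two-element tuple that callers can unpack the same way.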

get_param_values()

Get the parameters of the policy.

This method is included to ensure consistency with TF policies.

Returns:	The parameters (in the form of the state dictionary).
Return type:	dict

name

Name of policy.

Returns:	Name of policy.
Return type:	str

observation_space

The observation space for the environment.

Returns:	Observation space.
Return type:	akro.Space

reset(dones=None)

Reset the policy.

Parameters:	dones (numpy.ndarray) – Reset values

set_param_values(state_dict)

Set the parameters of the policy.

This method is included to ensure consistency with TF policies.

Parameters:	state_dict (dict) – State dictionary.
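Taken together, get_param_values and set_param_values allow parameters to be copied from one policy to another, for example to synchronize a second copy of a policy. The following is a minimal, torch-free sketch of that round trip. TinyPolicy and its "weights" entry are hypothetical, and the copy-on-read and key-checking behavior are design choices of this stand-in rather than documented behavior of the base class.

```python
import copy


class TinyPolicy:
    """Hypothetical stand-in; a real policy would hold torch parameters."""

    def __init__(self, weights):
        self._params = {"weights": list(weights)}

    def get_param_values(self):
        """Return the parameters as a state dictionary."""
        # Stand-in choice: return a copy so callers cannot mutate the policy.
        return copy.deepcopy(self._params)

    def set_param_values(self, state_dict):
        """Replace the parameters with those in state_dict."""
        # Stand-in choice: reject dictionaries with mismatched keys.
        if state_dict.keys() != self._params.keys():
            raise KeyError("state_dict keys do not match policy parameters")
        self._params = copy.deepcopy(state_dict)


source = TinyPolicy([0.1, 0.2, 0.3])
target = TinyPolicy([0.0, 0.0, 0.0])

# Round trip: read the parameters from one policy, load them into another.
target.set_param_values(source.get_param_values())
```

Whether a real state dictionary shares storage with the policy depends on the implementation, so code that modifies the returned dictionary should not assume the copy semantics shown here.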