garage.torch.policies.policy module

Base Policy.

class Policy(env_spec, name)

Bases: sphinx.ext.autodoc.importer._MockObject, abc.ABC

Policy base class.

Parameters:
  • env_spec – Environment specification.
  • name (str) – Name of policy.
action_space

The action space for the environment.

Returns: Action space.
Return type: akro.Space
get_action(observation)

Get a single action given an observation.

Parameters: observation (torch.Tensor) – Observation from the environment.
Returns:
  • torch.Tensor: Predicted action.
  • dict:
    • list[float]: Mean of the distribution.
    • list[float]: Log of standard deviation of the distribution.
Return type: tuple
get_actions(observations)

Get actions given observations.

Parameters: observations (torch.Tensor) – Observations from the environment.
Returns:
  • torch.Tensor: Predicted actions.
  • dict:
    • list[float]: Mean of the distribution.
    • list[float]: Log of standard deviation of the distribution.
Return type: tuple
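The relationship between get_action and get_actions can be sketched roughly as follows. This is only an illustration under stated assumptions: TinyGaussianPolicy is a hypothetical stand-in that does not subclass garage's Policy, the dictionary keys mean and log_std are assumed names, and the dictionary values are kept as tensors rather than converted to lists of floats.

```python
import torch
from torch import nn


class TinyGaussianPolicy(nn.Module):
    """Hypothetical stand-in for illustration; not part of garage."""

    def __init__(self, obs_dim, action_dim):
        super().__init__()
        self._mean = nn.Linear(obs_dim, action_dim)
        # State-independent log standard deviation, one entry per action dim.
        self._log_std = nn.Parameter(torch.zeros(action_dim))

    def get_actions(self, observations):
        """Sample one action per row of a batch of observations."""
        with torch.no_grad():
            mean = self._mean(observations)
            log_std = self._log_std.expand_as(mean)
            actions = mean + log_std.exp() * torch.randn_like(mean)
        # Assumed key names for the distribution information.
        return actions, dict(mean=mean, log_std=log_std)

    def get_action(self, observation):
        """Handle one observation by adding, then removing, a batch axis."""
        actions, info = self.get_actions(observation.unsqueeze(0))
        return actions[0], {key: value[0] for key, value in info.items()}


policy = TinyGaussianPolicy(obs_dim=4, action_dim=2)
action, info = policy.get_action(torch.zeros(4))         # single observation
actions, infos = policy.get_actions(torch.zeros(3, 4))   # batch of three
```

Both calls return a tuple of (action tensor, info dict); the single-observation variant simply drops the leading batch dimension from every entry.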
get_param_values()

Get the parameters of the policy.

This method is included to ensure consistency with TF policies.

Returns: The parameters (in the form of the state dictionary).
Return type: dict
name

Name of policy.

Returns: Name of policy.
Return type: str
observation_space

The observation space for the environment.

Returns: Observation space.
Return type: akro.Space
reset(dones=None)

Reset the policy.

Parameters: dones (numpy.ndarray) – Reset values
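The documentation above does not spell out how dones is interpreted. One plausible reading, shown here purely as an assumption, is that dones holds one boolean per parallel environment and that a stateful policy clears its per-environment state wherever the flag is set. StatefulPolicy, its _hidden attribute, and the default of a single True flag are all hypothetical and not taken from garage.

```python
import numpy as np


class StatefulPolicy:
    """Hypothetical stand-in for illustration; not part of garage."""

    def __init__(self, hidden_dim):
        self._hidden_dim = hidden_dim
        self._hidden = np.zeros((1, hidden_dim))

    def reset(self, dones=None):
        # Assumption: with no argument, treat the policy as serving a single
        # environment whose episode has just ended.
        if dones is None:
            dones = np.array([True])
        dones = np.asarray(dones, dtype=bool)
        # Re-allocate when the number of parallel environments changes.
        if self._hidden.shape[0] != len(dones):
            self._hidden = np.zeros((len(dones), self._hidden_dim))
        # Clear state only for the environments flagged as done.
        self._hidden[dones] = 0.0


policy = StatefulPolicy(hidden_dim=2)
policy.reset(np.array([True, True, True]))
policy._hidden += 1.0  # pretend all three environments have been stepped
policy.reset(np.array([True, False, True]))  # only the middle one continues
```

After the second call, the rows for the first and third environments are zeroed while the middle row keeps its value.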
set_param_values(state_dict)

Set the parameters of the policy.

This method is included to ensure consistency with TF policies.

Parameters: state_dict (dict) – State dictionary.
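As a rough sketch of how get_param_values and set_param_values might be used together, the example below copies parameters from one policy to another. It assumes the policy is a torch.nn.Module whose state dictionary comes from state_dict() and is restored with load_state_dict(); ParamPolicy is a hypothetical stand-in and not garage's implementation.

```python
import torch
from torch import nn


class ParamPolicy(nn.Module):
    """Hypothetical stand-in for illustration; not part of garage."""

    def __init__(self):
        super().__init__()
        self._net = nn.Linear(3, 2)

    def get_param_values(self):
        # Assumption: the "state dictionary" is the module's state_dict().
        return self.state_dict()

    def set_param_values(self, state_dict):
        self.load_state_dict(state_dict)


# Two independently initialised policies start with different weights.
source, target = ParamPolicy(), ParamPolicy()

# Copy every parameter from source into target.
target.set_param_values(source.get_param_values())

same = all(
    torch.equal(p, q)
    for p, q in zip(source.parameters(), target.parameters())
)
```

The same round trip would also serve for checkpointing, since the returned dictionary can be passed to torch.save and loaded back later.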