garage.tf.policies.discrete_qf_derived_policy module¶
A Discrete QFunction-derived policy.
This policy chooses the action that yields the largest Q-value.
class DiscreteQfDerivedPolicy(env_spec, qf, name='DiscreteQfDerivedPolicy')[source]¶
Bases: garage.tf.policies.policy.Policy
Discrete QFunction-derived policy.
Parameters:
- env_spec (garage.envs.env_spec.EnvSpec) – Environment specification.
- qf (garage.q_functions.QFunction) – The q-function used.
- name (str) – Name of the policy.
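The selection rule behind this policy is simple: evaluate the Q-function on the observation and take the argmax over the discrete actions. A minimal NumPy sketch of that rule, assuming a hypothetical lookup-table Q-function (`Q_TABLE` and `q_values` are illustrative stand-ins, not part of garage):

```python
import numpy as np

# Hypothetical Q-table: rows index observations (states), columns index actions.
Q_TABLE = np.array([
    [0.1, 0.9, 0.3],   # state 0: action 1 has the largest Q-value
    [0.7, 0.2, 0.5],   # state 1: action 0 has the largest Q-value
])


def q_values(observation):
    """Stand-in for the real q-function's forward pass."""
    return Q_TABLE[observation]


def derived_action(observation):
    """Choose the action with the largest Q-value, as this policy does."""
    return int(np.argmax(q_values(observation)))


print(derived_action(0))  # -> 1
print(derived_action(1))  # -> 0
```

In the real class, `q_values` is replaced by a session run of the wrapped TensorFlow q-function, but the argmax selection is the same idea.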
get_action(observation)[source]¶
Get an action from this policy for the input observation.
Parameters: observation (numpy.ndarray) – Observation from the environment.
Returns:
- numpy.ndarray – Single optimal action from this policy.
- dict – Predicted action and agent information; an empty dict, since this policy has no parameterization.
get_actions(observations)[source]¶
Get actions from this policy for the input observations.
Parameters: observations (numpy.ndarray) – Observations from the environment.
Returns:
- numpy.ndarray – Optimal actions from this policy.
- dict – Predicted actions and agent information; an empty dict, since this policy has no parameterization.
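`get_actions` is the batched form of `get_action`: one argmax per row of Q-values, returned together with the (empty) info dict. A hedged sketch of that return contract, with `get_actions_sketch` as a hypothetical stand-in that takes precomputed Q-values rather than raw observations:

```python
import numpy as np


def get_actions_sketch(q_batch):
    """Batched argmax over a (n_obs, n_actions) array of Q-values.

    Mirrors the documented return shape: (actions, empty agent-info dict).
    """
    actions = np.argmax(q_batch, axis=1)
    return actions, {}


q_batch = np.array([[0.1, 0.9, 0.3],
                    [0.7, 0.2, 0.5]])
actions, info = get_actions_sketch(q_batch)
print(actions.tolist())  # -> [1, 0]
print(info)              # -> {}
```

The single-observation `get_action` is the one-row special case of the same computation.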
vectorized¶
Whether this policy supports vectorized operations.
Returns: True if the primitive supports vectorized operations.
Return type: bool