garage.tf package

TensorFlow Branch.

paths_to_tensors(paths, max_path_length, baseline_predictions, discount, gae_lambda)

Return processed sample data based on the collected paths.

Parameters:
  • paths (list[dict]) – A list of collected paths.
  • max_path_length (int) – Maximum length of a single rollout.
  • baseline_predictions (numpy.ndarray) – Values predicted by the baseline used for GAE (Generalized Advantage Estimation).
  • discount (float) – Environment reward discount.
  • gae_lambda (float) – Lambda used for generalized advantage estimation.
Returns:

Processed sample data, with keys:
  • observations: (numpy.ndarray)
  • actions: (numpy.ndarray)
  • rewards: (numpy.ndarray)
  • baselines: (numpy.ndarray)
  • returns: (numpy.ndarray)
  • valids: (numpy.ndarray)
  • agent_infos: (dict)
  • env_infos: (dict)
  • paths: (list[dict])

Return type:

dict
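
The processing described above can be illustrated with a minimal NumPy sketch. This is not garage's implementation. The helper names (discount_cumsum, pad_to_length, gae_advantages, sketch_paths_to_tensors) are hypothetical, and the sketch assumes each path is a dict with a "rewards" entry no longer than max_path_length. It shows only how discounted returns, zero-padding, and a validity mask could be produced, plus the standard GAE advantage formula that discount and gae_lambda usually feed into.

```python
import numpy as np


def discount_cumsum(x, discount):
    """Return y with y[t] = sum over k of discount**k * x[t + k]."""
    out = np.zeros(len(x), dtype=np.float64)
    running = 0.0
    # Walk backwards so each step reuses the already-discounted tail.
    for t in reversed(range(len(x))):
        running = x[t] + discount * running
        out[t] = running
    return out


def pad_to_length(x, max_length):
    """Zero-pad a 1-D array at the end up to max_length entries.

    Assumes len(x) <= max_length.
    """
    x = np.asarray(x, dtype=np.float64)
    return np.concatenate([x, np.zeros(max_length - len(x))])


def gae_advantages(rewards, baselines, discount, gae_lambda):
    """Standard GAE: discounted sum of TD residuals with factor discount * gae_lambda.

    Assumes the value after the final step is 0 (terminated path).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.append(np.asarray(baselines, dtype=np.float64), 0.0)
    deltas = rewards + discount * values[1:] - values[:-1]
    return discount_cumsum(deltas, discount * gae_lambda)


def sketch_paths_to_tensors(paths, max_path_length, discount):
    """Hypothetical, reduced stand-in that builds only rewards, returns, and valids."""
    rewards, returns, valids = [], [], []
    for path in paths:
        r = np.asarray(path["rewards"], dtype=np.float64)
        rewards.append(pad_to_length(r, max_path_length))
        returns.append(pad_to_length(discount_cumsum(r, discount), max_path_length))
        # 1 marks a real time step, 0 marks padding.
        valids.append(pad_to_length(np.ones(len(r)), max_path_length))
    return dict(
        rewards=np.stack(rewards),
        returns=np.stack(returns),
        valids=np.stack(valids),
    )


example_paths = [{"rewards": [1.0, 1.0, 1.0]}, {"rewards": [2.0]}]
samples = sketch_paths_to_tensors(example_paths, max_path_length=4, discount=0.5)
```

In this sketch every array has shape (number of paths, max_path_length), so the valids mask can be multiplied into a per-step loss to ignore padded steps.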

Subpackages