garage.sampler.is_sampler module
Importance sampling sampler.
class ISSampler(algo, env, n_backtrack=None, n_is_pretrain=0, init_is=0, skip_is_itrs=False, hist_variance_penalty=0.0, max_is_ratio=0, ess_threshold=0, randomize_draw=False)
Bases: garage.sampler.batch_sampler.BatchSampler
Importance sampling sampler.
Sampler which alternates between live sampling iterations using BatchSampler and importance sampling iterations.
Parameters:
- algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
- env (garage.envs.GarageEnv) – An environment instance.
- n_backtrack (int) – Number of past policies to update from. If None, all past policies are used.
- n_is_pretrain (int) – Number of importance sampling iterations to perform at the beginning of training.
- init_is (bool) – Whether the first iteration after pretraining is an importance sampling iteration.
- skip_is_itrs (bool) – If True, perform no importance sampling iterations after pretraining.
- hist_variance_penalty (int) – Penalty on the variance of the historical policy.
- max_is_ratio (int) – Maximum allowed importance sampling ratio.
- ess_threshold (int) – Minimum effective sample size required.
- randomize_draw (bool) – Whether to randomize importance samples.
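The roles of max_is_ratio and ess_threshold can be illustrated with a minimal, standalone sketch. It is not garage's implementation: the helper names (importance_ratios, effective_sample_size, passes_ess_check) are hypothetical, the effective sample size uses the common Kish estimate (sum w)^2 / sum(w^2), and treating a value of 0 as "disabled" is an assumption based only on the default arguments in the signature.

```python
import math


def importance_ratios(logp_current, logp_behavior, max_is_ratio=0.0):
    """Return per-sample ratios exp(logp_current - logp_behavior).

    Assumption: max_is_ratio <= 0 means "no clipping"; a positive value
    caps every ratio at that value.
    """
    ratios = [math.exp(c - b) for c, b in zip(logp_current, logp_behavior)]
    if max_is_ratio > 0:
        ratios = [min(r, max_is_ratio) for r in ratios]
    return ratios


def effective_sample_size(ratios):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(ratios)
    total_sq = sum(r * r for r in ratios)
    return (total * total) / total_sq if total_sq > 0 else 0.0


def passes_ess_check(ratios, ess_threshold=0.0):
    """Assumption: ess_threshold <= 0 disables the check."""
    if ess_threshold <= 0:
        return True
    return effective_sample_size(ratios) >= ess_threshold


# Hypothetical log-probabilities of the same actions under two policies.
ratios = importance_ratios([0.0, 0.0, 5.0], [0.0, 0.0, 0.0], max_is_ratio=2.0)
usable = passes_ess_check(ratios, ess_threshold=2.0)
```

With equal weights the effective sample size equals the number of samples; a single dominant weight drives it toward 1, which is the situation a minimum threshold is meant to reject.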
add_history(policy_distribution, paths)
Store policy distribution and paths in history.
Parameters:
- policy_distribution (garage.tf.distributions.Distribution) – Policy distribution.
- paths (list) – Paths.
get_history_list(n_past=None)
Get list of (distribution, data) tuples from history.
Parameters: n_past (int) – Number of past policies to update from. If None, all past policies are used.
Returns: A list of paths.
Return type: list
history
History of policies.
Policies that have interacted with the environment, together with the data from their interaction episode(s).
Type: list