garage.experiment

Experiment functions.

class MetaEvaluator(*, test_task_sampler, n_exploration_eps=10, n_test_tasks=None, n_test_episodes=1, prefix='MetaTest', test_task_names=None, worker_class=DefaultWorker, worker_args=None)

Evaluates Meta-RL algorithms on test environments.

Parameters
  • test_task_sampler (TaskSampler) – Sampler for test tasks. To demonstrate the effectiveness of a meta-learning method, these should be different from the training tasks.

  • n_test_tasks (int or None) – Number of test tasks to sample each time evaluation is performed. Note that tasks are sampled “without replacement”. If None, defaults to test_task_sampler.n_tasks.

  • n_exploration_eps (int) – Number of episodes to gather from the exploration policy before requesting the meta algorithm to produce an adapted policy.

  • n_test_episodes (int) – Number of episodes to use for each adapted policy. The adapted policy should forget previous episodes when .reset() is called.

  • prefix (str) – Prefix to use when logging. Defaults to MetaTest. For example, this results in logging the key ‘MetaTest/SuccessRate’. If not set to MetaTest, it should probably be set to MetaTrain.

  • test_task_names (list[str]) – List of task names to test. Should be in an order consistent with the task_id env_info, if that is present.

  • worker_class (type) – Type of worker the Sampler should use.

  • worker_args (dict or None) – Additional arguments that should be passed to the worker.

evaluate(self, algo, test_episodes_per_task=None)

Evaluate the Meta-RL algorithm on the test tasks.

Parameters
  • algo (MetaRLAlgorithm) – The algorithm to evaluate.

  • test_episodes_per_task (int or None) – Number of episodes per task.
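The evaluation flow implied by n_exploration_eps and n_test_episodes can be sketched roughly as follows. This is a minimal illustration, not garage's implementation: evaluate_sketch, StubAlgo, and stub_collect are hypothetical names, and the method names get_exploration_policy and adapt_policy are assumptions about the algorithm interface that this page does not confirm.

```python
def evaluate_sketch(algo, tasks, collect, n_exploration_eps=10,
                    n_test_episodes=1):
    """Explore, adapt, then test on each task in turn.

    `collect(policy, task, n)` is a hypothetical helper that returns a
    list of n episodes gathered with `policy` on `task`.
    """
    results = []
    for task in tasks:
        # Gather exploration episodes before asking for an adapted policy.
        explore_policy = algo.get_exploration_policy()
        exploration_eps = collect(explore_policy, task, n_exploration_eps)
        adapted_policy = algo.adapt_policy(explore_policy, exploration_eps)
        # Evaluate the adapted policy on the same task.
        results.append(collect(adapted_policy, task, n_test_episodes))
    return results


class StubAlgo:
    """Stand-in algorithm so the sketch runs without garage installed."""

    def get_exploration_policy(self):
        return 'explore'

    def adapt_policy(self, exploration_policy, exploration_episodes):
        return f'adapted-on-{len(exploration_episodes)}'


def stub_collect(policy, task, n):
    # Each fake "episode" just records which policy ran on which task.
    return [(task, policy)] * n
```

With the stubs, evaluate_sketch(StubAlgo(), ['t0', 't1'], stub_collect) returns one list of test episodes per task, each produced by a policy adapted on that task's exploration episodes.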

SnapshotConfig
class Snapshotter(snapshot_dir=os.path.join(os.getcwd(), 'data/local/experiment'), snapshot_mode='last', snapshot_gap=1)

Snapshotter snapshots training data.

When training, it saves data to binary files. When resuming, it loads from saved data.

Parameters
  • snapshot_dir (str) – Path to save the log and iteration snapshot.

  • snapshot_mode (str) – Mode to save the snapshot. Can be either “all” (all iterations will be saved), “last” (only the last iteration will be saved), “gap” (every snapshot_gap iterations are saved), “gap_and_last” (save the last iteration as ‘params.pkl’ and save every snapshot_gap iteration separately), “gap_overwrite” (same as gap but overwrites the last saved snapshot), or “none” (do not save snapshots).

  • snapshot_gap (int) – Gap between snapshot iterations. Wait this number of iterations before taking another snapshot.

property snapshot_dir(self)

Return the snapshot directory.

Returns

The snapshot directory.

Return type

str

property snapshot_mode(self)

Return the type of snapshot.

Returns

The type of snapshot. Can be “all”, “last”, “gap”, “gap_overwrite”, “gap_and_last”, or “none”.

Return type

str

property snapshot_gap(self)

Return the gap between snapshot iterations.

Returns

The number of iterations between snapshots.

Return type

int

save_itr_params(self, itr, params)

Save the parameters if at the right iteration.

Parameters
  • itr (int) – Number of iterations. Used as the index of snapshot.

  • params (obj) – Content of snapshot to be saved.

Raises

ValueError – If snapshot_mode is not one of “all”, “last”, “gap”, “gap_overwrite”, “gap_and_last”, or “none”.
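The snapshot modes described above can be summarized with a small, self-contained sketch. It is an illustration under stated assumptions rather than the library's code: snapshot_files is a hypothetical helper, the per-iteration file name itr_N.pkl is assumed, writing “gap_overwrite” snapshots to params.pkl is assumed, and "every snapshot_gap iterations" is read as itr % gap == 0.

```python
def snapshot_files(itr, mode, gap=1):
    """Return the file names written at iteration `itr` under `mode`."""
    per_itr = f'itr_{itr}.pkl'  # assumed per-iteration file name
    latest = 'params.pkl'       # name the docs give for the last iteration
    on_gap = itr % gap == 0     # assumed meaning of "every gap iterations"
    if mode == 'all':
        return [per_itr]
    if mode == 'last':
        return [latest]
    if mode == 'gap':
        return [per_itr] if on_gap else []
    if mode == 'gap_overwrite':
        # Same schedule as 'gap', but a single file is overwritten.
        return [latest] if on_gap else []
    if mode == 'gap_and_last':
        return ([per_itr] if on_gap else []) + [latest]
    if mode == 'none':
        return []
    raise ValueError(f'Invalid snapshot mode: {mode}')
```

For example, with gap=2 the “gap_and_last” mode writes only params.pkl at iteration 3, and both itr_4.pkl and params.pkl at iteration 4.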

load(self, load_dir, itr='last')

Load one snapshot of parameters from disk.

Parameters
  • load_dir (str) – Directory of the cloudpickle file to resume experiment from.

  • itr (int or str) – Iteration to load. Can be an integer, ‘last’ or ‘first’.

Returns

Loaded snapshot.

Return type

dict

Raises
  • ValueError – If itr is neither an integer nor one of (“last”, “first”).

  • FileNotFoundError – If the snapshot file is not found in load_dir.

  • NotAFileError – If the snapshot exists but is not a file.
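One way the itr argument and the three error cases could fit together is sketched below. The helper resolve_snapshot_path is hypothetical, the itr_N.pkl naming is an assumption, and NotAFileError is redefined locally as a stand-in so the example runs without garage; the real lookup rules may differ.

```python
import os
import re


class NotAFileError(Exception):
    """Local stand-in for the exception named above."""


_ITR_FILE = re.compile(r'^itr_(\d+)\.pkl$')  # assumed per-iteration naming


def resolve_snapshot_path(load_dir, itr='last'):
    """Map `itr` to a snapshot path inside `load_dir`."""
    if isinstance(itr, int):
        name = f'itr_{itr}.pkl'
    elif itr in ('last', 'first'):
        # Sort numerically so that itr_10 comes after itr_2.
        numbered = sorted(
            int(m.group(1))
            for m in map(_ITR_FILE.match, os.listdir(load_dir)) if m)
        has_params = os.path.exists(os.path.join(load_dir, 'params.pkl'))
        if itr == 'last' and has_params:
            name = 'params.pkl'
        elif numbered:
            chosen = numbered[-1] if itr == 'last' else numbered[0]
            name = f'itr_{chosen}.pkl'
        else:
            raise FileNotFoundError(f'No snapshot found in {load_dir}')
    else:
        raise ValueError("itr must be an integer, 'last', or 'first'")
    path = os.path.join(load_dir, name)
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    if not os.path.isfile(path):
        raise NotAFileError(path)
    return path
```

The resolved path would then be opened and unpickled by the caller; that step is omitted here because it depends on how the snapshot was written.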

class ConstructEnvsSampler(env_constructors)

Bases: garage.experiment.task_sampler.TaskSampler

TaskSampler where each task has its own constructor.

Generally, this is used when the different tasks are completely different environments.

Parameters

env_constructors (list[Callable[Environment]]) – Callables that produce environments (for example, environment types).

property n_tasks(self)

int: the number of tasks.

sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.

Parameters
  • n_tasks (int) – Number of updates to sample.

  • with_replacement (bool) – Whether tasks can repeat when sampled. Note that if more tasks are sampled than exist, then tasks may repeat, but only after every environment has been included at least once in this batch. Ignored for continuous task spaces.

Returns

Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.

Return type

list[EnvUpdate]
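The with_replacement behavior described above (tasks repeat only after every task has appeared once in the batch) can be sketched with shuffled blocks of indices. This is an illustrative sketch using only the standard library; sample_task_indices is a hypothetical helper, not a garage function.

```python
import math
import random


def sample_task_indices(n_to_sample, n_available, with_replacement,
                        rng=random):
    """Pick task indices following the documented repetition rule."""
    if with_replacement:
        # Independent draws: any index may repeat at any time.
        return [rng.randrange(n_available) for _ in range(n_to_sample)]
    indices = []
    # Concatenate shuffled permutations so a task can only repeat
    # after every task has been included once.
    for _ in range(math.ceil(n_to_sample / n_available)):
        block = list(range(n_available))
        rng.shuffle(block)
        indices.extend(block)
    return indices[:n_to_sample]
```

Requesting 7 indices from 3 tasks without replacement yields two full permutations of the 3 tasks followed by one extra index.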

class EnvPoolSampler(envs)

Bases: garage.experiment.task_sampler.TaskSampler

TaskSampler that samples from a finite pool of environments.

This can be used with any environments, but is generally best when using in-process samplers with environments that are expensive to construct.

Parameters

envs (list[Environment]) – List of environments to use as a pool.

property n_tasks(self)

int: the number of tasks.

sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.

Parameters
  • n_tasks (int) – Number of updates to sample.

  • with_replacement (bool) – Whether tasks can repeat when sampled. Since this cannot be easily implemented for an object pool, setting this to True results in ValueError.

Raises

ValueError – If the number of requested tasks is larger than the pool, or with_replacement is set.

Returns

Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.

Return type

list[EnvUpdate]

grow_pool(self, new_size)

Increase the size of the pool by copying random tasks in it.

Note that this only copies the tasks already in the pool, and cannot create new original tasks in any way.

Parameters

new_size (int) – Size the pool should be after growing.
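The pool semantics above (no sampling with replacement, no oversampling, growth by copying existing entries) can be illustrated with a toy class. PoolSketch is hypothetical and returns the pooled objects directly instead of EnvUpdate instances; it is a sketch of the documented behavior, not the library's implementation.

```python
import copy
import random


class PoolSketch:
    """Toy finite pool mirroring the documented sample/grow_pool rules."""

    def __init__(self, envs):
        self._envs = list(envs)

    @property
    def n_tasks(self):
        return len(self._envs)

    def sample(self, n_tasks, with_replacement=False):
        if n_tasks > len(self._envs):
            raise ValueError('Cannot sample more tasks than the pool holds.')
        if with_replacement:
            raise ValueError('Sampling with replacement is not supported.')
        return random.sample(self._envs, n_tasks)

    def grow_pool(self, new_size):
        if new_size <= len(self._envs):
            return  # the pool is already large enough
        # Copies of existing entries only; no new original tasks appear.
        to_copy = random.choices(self._envs, k=new_size - len(self._envs))
        self._envs.extend(copy.deepcopy(env) for env in to_copy)
```

Growing the pool first is one way to request more tasks than there are distinct environments, at the cost of duplicated tasks.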

class MetaWorldTaskSampler(benchmark, kind, wrapper=None, add_env_onehot=False)

Bases: garage.experiment.task_sampler.TaskSampler

TaskSampler that distributes a Meta-World benchmark across workers.

Parameters
  • benchmark (metaworld.Benchmark) – Benchmark to sample tasks from.

  • kind (str) – Must be either ‘test’ or ‘train’. Determines whether to sample training or test tasks from the Benchmark.

  • wrapper (Callable[garage.Env, garage.Env] or None) – Wrapper to apply to env instances.

  • add_env_onehot (bool) – If true, a one-hot representing the current environment name will be added to the environments. Should only be used with multi-task benchmarks.

Raises

ValueError – If kind is not ‘train’ or ‘test’. Also raised if add_env_onehot is used on a Meta-World meta-learning (not multi-task) benchmark.

property n_tasks(self)

int: the number of tasks.

sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.

Note that this will always return environments in the same order, to make parallel sampling across workers efficient. If randomizing the environment order is required, shuffle the result of this method.

Parameters
  • n_tasks (int) – Number of updates to sample. Must be a multiple of the number of env classes in the benchmark (e.g. 1 for MT/ML1, 10 for MT10, 50 for MT50). Tasks for each environment will be grouped to be adjacent to each other.

  • with_replacement (bool) – Whether tasks can repeat when sampled. Since this cannot be easily implemented for an object pool, setting this to True results in ValueError.

Raises

ValueError – If the number of requested tasks is not equal to the number of classes or the number of total tasks.

Returns

Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.

Return type

list[EnvUpdate]
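The fixed, grouped ordering described for n_tasks can be sketched as follows. The helper grouped_task_order and the environment names are hypothetical, and the sketch follows the "multiple of the number of env classes" rule from the n_tasks description; it does not use Meta-World itself.

```python
import random


def grouped_task_order(env_names, n_tasks):
    """Return env names in a fixed order, with each env's tasks adjacent."""
    n_classes = len(env_names)
    if n_tasks % n_classes != 0:
        raise ValueError(
            f'n_tasks must be a multiple of {n_classes}, got {n_tasks}')
    per_env = n_tasks // n_classes
    return [name for name in env_names for _ in range(per_env)]


order = grouped_task_order(['env_a', 'env_b'], 4)
# Shuffle a copy only if a randomized environment order is required.
shuffled = random.sample(order, len(order))
```

Here order is always ['env_a', 'env_a', 'env_b', 'env_b'], while shuffled holds the same entries in random order.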

class SetTaskSampler(env_constructor, *, env=None, wrapper=None)

Bases: garage.experiment.task_sampler.TaskSampler

TaskSampler where the environment can sample “task objects”.

This is used for environments that implement sample_tasks and set_task. For example, HalfCheetahVelEnv, as implemented in Garage.

Parameters
  • env_constructor (type) – Type of the environment.

  • env (garage.Environment or None) – Instance of env_constructor to sample from (will be constructed if not provided).

  • wrapper (Callable[garage.Environment, garage.Environment] or None) – Wrapper function to apply to environment.

property n_tasks(self)

int or None: The number of tasks if known and finite.

sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.

Parameters
  • n_tasks (int) – Number of updates to sample.

  • with_replacement (bool) – Whether tasks can repeat when sampled. Note that if more tasks are sampled than exist, then tasks may repeat, but only after every environment has been included at least once in this batch. Ignored for continuous task spaces.

Returns

Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.

Return type

list[EnvUpdate]
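The sample_tasks / set_task pair mentioned above can be illustrated with a toy environment. Everything here is a hypothetical sketch: VelocityTaskEnv, its task dictionary format, the velocity range, and sample_updates are invented for illustration, and plain closures stand in for EnvUpdate objects.

```python
import random


class VelocityTaskEnv:
    """Toy environment exposing sample_tasks and set_task."""

    def __init__(self):
        self._task = {'velocity': 0.0}

    def sample_tasks(self, num_tasks):
        # Hypothetical task format: a target velocity in [0, 2].
        return [{'velocity': random.uniform(0.0, 2.0)}
                for _ in range(num_tasks)]

    def set_task(self, task):
        self._task = task

    @property
    def target_velocity(self):
        return self._task['velocity']


def sample_updates(env_constructor, n_tasks, env=None):
    """Return callables that configure an environment with a sampled task."""
    # Construct an instance to sample from if none was provided.
    env = env if env is not None else env_constructor()

    def make_update(task):
        def update(target_env):
            target_env.set_task(task)
            return target_env
        return update

    return [make_update(task) for task in env.sample_tasks(n_tasks)]
```

Each returned callable can be applied to a fresh environment, for example updates[0](VelocityTaskEnv()), which mirrors how an environment update configures a worker's environment with a new task.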

class TaskSampler

Bases: abc.ABC

Class for sampling batches of tasks, represented as EnvUpdate objects.

n_tasks

Number of tasks, if known and finite.

Type

int or None

abstract sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.

Parameters
  • n_tasks (int) – Number of updates to sample.

  • with_replacement (bool) – Whether tasks can repeat when sampled. Note that if more tasks are sampled than exist, then tasks may repeat, but only after every environment has been included at least once in this batch. Ignored for continuous task spaces.

Returns

Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.

Return type

list[EnvUpdate]

property n_tasks(self)

int or None: The number of tasks if known and finite.