garage.experiment package
Experiment functions.
-
run_experiment
(method_call=None, batch_tasks=None, exp_prefix='experiment', exp_name=None, log_dir=None, script='garage.experiment.experiment_wrapper', python_command='python', dry=False, env=None, variant=None, force_cpu=False, pre_commands=None, **kwargs)[source]¶ Serialize the method call and run the experiment using specified mode.
Parameters: - method_call (callable) – A method call.
- batch_tasks (list[dict]) – A batch of method calls.
- exp_prefix (str) – Name prefix for the experiment.
- exp_name (str) – Name of the experiment.
- log_dir (str) – Log directory for the experiment.
- script (str) – The name of the entry-point Python script.
- python_command (str) – Python command to run the experiment.
- dry (bool) – Whether to do a dry-run, which only prints the commands without executing them.
- env (dict) – Extra environment variables.
- variant (dict) – If provided, should be a dictionary of parameters.
- force_cpu (bool) – Whether to make all GPU devices invisible, forcing the experiment to run on the CPU.
- pre_commands (str) – Commands to run before the experiment.
- kwargs (dict) – Additional parameters.
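As a rough illustration of how the env, force_cpu, pre_commands and dry options could interact, the sketch below assembles a shell command and either prints or runs it. The helpers build_command and run, and the use of the -m flag, are hypothetical and are not garage's implementation. Hiding GPUs through an empty CUDA_VISIBLE_DEVICES is also an assumption about one common way to force CPU execution.

```python
import shlex
import subprocess


def build_command(script, python_command='python', env=None,
                  force_cpu=False, pre_commands=None):
    """Assemble a shell command string (hypothetical helper, not garage API)."""
    env = dict(env or {})
    if force_cpu:
        # Assumption: an empty CUDA_VISIBLE_DEVICES hides all GPUs.
        env['CUDA_VISIBLE_DEVICES'] = ''
    if isinstance(pre_commands, str):
        pre_commands = [pre_commands]
    prefix = ' '.join(
        '{}={}'.format(k, shlex.quote(str(v))) for k, v in sorted(env.items()))
    main = '{} -m {}'.format(python_command, script)
    if prefix:
        main = prefix + ' ' + main
    # Pre-commands run first; '&&' stops the chain if one of them fails.
    return ' && '.join(list(pre_commands or []) + [main])


def run(command, dry=False):
    """Print the command on a dry run; otherwise hand it to the shell."""
    if dry:
        print(command)
        return None
    return subprocess.call(command, shell=True)
```

Calling run(build_command(...), dry=True) only prints the assembled command, which mirrors the documented meaning of dry.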
-
to_local_command
(params, python_command='python', script='garage.experiment.experiment_wrapper')[source]¶
-
class
LocalRunner
(snapshot_config, max_cpus=1)[source]¶ Bases:
object
Base class of local runner.
Use Runner.setup(algo, env) to set up the algorithm and environment for the runner, and Runner.train() to start training.
Parameters: - snapshot_config (garage.experiment.SnapshotConfig) – The snapshot configuration used by LocalRunner to create the snapshotter. If None, it will create one with default settings.
- max_cpus (int) – The maximum number of parallel sampler workers.
Note
For the use of any TensorFlow environments, policies and algorithms, please use LocalTFRunner().
Examples
# to train
runner = LocalRunner()
env = Env(…)
policy = Policy(…)
algo = Algo(env=env, policy=policy, …)
runner.setup(algo, env)
runner.train(n_epochs=100, batch_size=4000)

# to resume immediately.
runner = LocalRunner()
runner.restore(resume_from_dir)
runner.resume()

# to resume with modified training arguments.
runner = LocalRunner()
runner.restore(resume_from_dir)
runner.resume(n_epochs=20)
-
get_env_copy
()[source]¶ Get a copy of the environment.
Returns: An environment instance. Return type: garage.envs.GarageEnv
-
log_diagnostics
(pause_for_plot=False)[source]¶ Log diagnostics.
Parameters: pause_for_plot (bool) – Pause for plot.
-
make_sampler
(sampler_cls, *, seed=None, n_workers=1, max_path_length=None, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, sampler_args=None, worker_args=None)[source]¶ Construct a Sampler from a Sampler class.
Parameters: - sampler_cls (type) – The type of sampler to construct.
- seed (int) – Seed to use in sampler workers.
- max_path_length (int) – Maximum path length to be sampled by the sampler. Paths longer than this will be truncated.
- n_workers (int) – The number of workers the sampler should use.
- worker_class (type) – Type of worker the Sampler should use.
- sampler_args (dict or None) – Additional arguments that should be passed to the sampler.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
Raises: ValueError – If max_path_length isn’t passed and the algorithm doesn’t contain a max_path_length field, or if the algorithm doesn’t have a policy field.
Returns: An instance of the sampler class.
Return type: sampler_cls
-
obtain_samples
(itr, batch_size=None, agent_update=None, env_update=None)[source]¶ Obtain one batch of samples.
Parameters: - itr (int) – Index of iteration (epoch).
- batch_size (int) – Number of steps in batch. This is a hint that the sampler may or may not respect.
- agent_update (object) – Value which will be passed into the agent_update_fn before doing rollouts. If a list is passed in, it must have length exactly factory.n_workers, and will be spread across the workers.
- env_update (object) – Value which will be passed into the env_update_fn before doing rollouts. If a list is passed in, it must have length exactly factory.n_workers, and will be spread across the workers.
Raises: ValueError – Raised if the runner was initialized without a sampler, or batch_size wasn’t provided here or to train.
Returns: One batch of samples.
Return type:
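The list-handling rule described for agent_update and env_update can be sketched as follows. The helper spread_updates is hypothetical and is not part of garage. Broadcasting a non-list value to every worker is an assumption, since the text only specifies the behavior for lists.

```python
def spread_updates(update, n_workers):
    """Return one update per worker (hypothetical helper, not garage API)."""
    if isinstance(update, list):
        # A list must supply exactly one entry per worker.
        if len(update) != n_workers:
            raise ValueError(
                'Expected {} updates, got {}'.format(n_workers, len(update)))
        return list(update)
    # Assumption: a single value is broadcast, so every worker gets the same object.
    return [update] * n_workers
```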
-
restore
(from_dir, from_epoch='last')[source]¶ Restore experiment from snapshot.
Parameters: Returns: Arguments for train().
Return type:
-
resume
(n_epochs=None, batch_size=None, plot=None, store_paths=None, pause_for_plot=None)[source]¶ Resume from restored experiment.
This method provides the same interface as train().
If not specified, an argument will default to the saved arguments from the last call to train().
Parameters:
Raises: NotSetupError – If resume() is called before restore().
Returns: The average return in the last epoch cycle.
Return type:
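The defaulting rule for resume() (an unspecified argument falls back to the value saved from the last train() call) can be pictured with a small merge function. This is a sketch only: merge_train_args and the saved dictionary are hypothetical and do not reflect how garage stores its training arguments.

```python
def merge_train_args(saved, **overrides):
    """Fill unspecified (None) arguments from the saved ones."""
    merged = dict(saved)
    for name, value in overrides.items():
        if value is not None:
            merged[name] = value
    return merged


# Hypothetical arguments remembered from an earlier train() call.
saved = {'n_epochs': 100, 'batch_size': 4000, 'plot': False}
# Only n_epochs is overridden; batch_size=None keeps the saved value.
resumed = merge_train_args(saved, n_epochs=20, batch_size=None)
```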
-
save
(epoch)[source]¶ Save snapshot of current batch.
Parameters: epoch (int) – Epoch.
Raises: NotSetupError – If save() is called before the runner is set up.
-
setup
(algo, env, sampler_cls=None, sampler_args=None, n_workers=1, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, worker_args=None)[source]¶ Set up runner for algorithm and environment.
This method saves algo and env within runner and creates a sampler.
Note
After setup() is called all variables in session should have been initialized. setup() respects existing values in session so policy weights can be loaded before setup().
Parameters: - algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
- env (garage.envs.GarageEnv) – An environment instance.
- sampler_cls (garage.sampler.Sampler) – A sampler class.
- sampler_args (dict) – Arguments to be passed to sampler constructor.
- n_workers (int) – The number of workers the sampler should use.
- worker_class (type) – Type of worker the sampler should use.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
Raises: ValueError – If sampler_cls is passed and the algorithm doesn’t contain a max_path_length field.
-
step_epochs
()[source]¶ Step through each epoch.
This function returns a magic generator. When iterated through, this generator automatically performs services such as snapshotting and log management. It is used inside train() in each algorithm.
The generator initializes two variables: self.step_itr and self.step_path. To use the generator, these two have to be updated manually in each epoch, as the example shows below.
Yields: int – The next training epoch.
Examples
for epoch in runner.step_epochs():
    runner.step_path = runner.obtain_samples(…)
    self.train_once(…)
    runner.step_itr += 1
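To make the generator pattern in the example above more concrete, here is a self-contained toy version. ToyRunner is a hypothetical stand-in and not garage's LocalRunner: it records (epoch, step_itr) pairs where a real runner would snapshot and log. It relies on the fact that code after a yield runs only when the caller requests the next item.

```python
class ToyRunner:
    """Hypothetical stand-in for a runner; not garage's LocalRunner."""

    def __init__(self, n_epochs):
        self.n_epochs = n_epochs
        self.step_itr = 0
        self.step_path = None
        self.saved = []  # (epoch, step_itr) pairs recorded in place of snapshots

    def step_epochs(self):
        for epoch in range(self.n_epochs):
            yield epoch  # the caller's loop body runs here
            # Bookkeeping resumes once the caller asks for the next epoch.
            self.saved.append((epoch, self.step_itr))
            self.step_path = None


runner = ToyRunner(n_epochs=3)
for epoch in runner.step_epochs():
    runner.step_path = ['path-{}'.format(epoch)]  # stands in for obtain_samples
    runner.step_itr += 1  # updated manually, as the text requires
```

Because the bookkeeping sits after the yield, it sees the step_itr and step_path values that the loop body set during that epoch.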
-
total_env_steps
¶ Total environment steps collected.
Returns: Total environment steps collected. Return type: int
-
class
LocalTFRunner
(snapshot_config, sess=None, max_cpus=1)[source]¶ Bases:
garage.experiment.local_runner.LocalRunner
This class implements a local runner for TensorFlow algorithms.
A local runner provides a default TensorFlow session using python context. This is useful for those experiment components (e.g. policy) that require a TensorFlow session during construction.
Use Runner.setup(algo, env) to set up the algorithm and environment for the runner, and Runner.train() to start training.
Parameters: - snapshot_config (garage.experiment.SnapshotConfig) – The snapshot configuration used by LocalRunner to create the snapshotter. If None, it will create one with default settings.
- max_cpus (int) – The maximum number of parallel sampler workers.
- sess (tf.Session) – An optional TensorFlow session. A new session will be created immediately if not provided.
Note
The local runner will set up a joblib task pool of size max_cpus, which may later be used by BatchSampler. If BatchSampler is not used, the processes in the pool will remain dormant.
This setup is required to use TensorFlow in a multiprocess environment before a TensorFlow session is created because TensorFlow is not fork-safe. See https://github.com/tensorflow/tensorflow/issues/2448.
When resuming via the command line, new snapshots will be saved into the SAME directory unless another is specified.
When resuming programmatically, the snapshot directory should be specified manually or through the run_experiment() interface.
Examples
# to train
with LocalTFRunner() as runner:
    env = gym.make('CartPole-v1')
    policy = CategoricalMLPPolicy(
        env_spec=env.spec, hidden_sizes=(32, 32))
    algo = TRPO(
        env=env, policy=policy, baseline=baseline,
        max_path_length=100, discount=0.99, max_kl_step=0.01)
    runner.setup(algo, env)
    runner.train(n_epochs=100, batch_size=4000)

# to resume immediately.
with LocalTFRunner() as runner:
    runner.restore(resume_from_dir)
    runner.resume()

# to resume with modified training arguments.
with LocalTFRunner() as runner:
    runner.restore(resume_from_dir)
    runner.resume(n_epochs=20)
-
make_sampler
(sampler_cls, *, seed=None, n_workers=1, max_path_length=None, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, sampler_args=None, worker_args=None)[source]¶ Construct a Sampler from a Sampler class.
Parameters: - sampler_cls (type) – The type of sampler to construct.
- seed (int) – Seed to use in sampler workers.
- max_path_length (int) – Maximum path length to be sampled by the sampler. Paths longer than this will be truncated.
- n_workers (int) – The number of workers the sampler should use.
- worker_class (type) – Type of worker the sampler should use.
- sampler_args (dict or None) – Additional arguments that should be passed to the sampler.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
Returns: An instance of the sampler class.
Return type: sampler_cls
-
setup
(algo, env, sampler_cls=None, sampler_args=None, n_workers=1, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, worker_args=None)[source]¶ Set up runner and sessions for algorithm and environment.
This method saves algo and env within runner and creates a sampler, and initializes all uninitialized variables in session.
Note
After setup() is called all variables in session should have been initialized. setup() respects existing values in session so policy weights can be loaded before setup().
Parameters: - algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
- env (garage.envs.GarageEnv) – An environment instance.
- sampler_cls (garage.sampler.Sampler) – A sampler class.
- sampler_args (dict) – Arguments to be passed to sampler constructor.
- n_workers (int) – The number of workers the sampler should use.
- worker_class (type) – Type of worker the sampler should use.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
-
class
MetaEvaluator
(*, test_task_sampler, max_path_length, n_exploration_traj=10, n_test_tasks=None, n_test_rollouts=1, prefix='MetaTest', test_task_names=None, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, worker_args=None)[source]¶ Bases:
object
Evaluates Meta-RL algorithms on test environments.
Parameters: - test_task_sampler (garage.experiment.TaskSampler) – Sampler for test tasks. To demonstrate the effectiveness of a meta-learning method, these should be different from the training tasks.
- max_path_length (int) – Maximum path length used for evaluation trajectories.
- n_test_tasks (int or None) – Number of test tasks to sample each time evaluation is performed. Note that tasks are sampled “without replacement”. If None, it is set to test_task_sampler.n_tasks.
- n_exploration_traj (int) – Number of trajectories to gather from the exploration policy before requesting the meta algorithm to produce an adapted policy.
- n_test_rollouts (int) – Number of rollouts to use for each adapted policy. The adapted policy should forget previous rollouts when .reset() is called.
- prefix (str) – Prefix to use when logging. Defaults to MetaTest. For example, this results in logging the key ‘MetaTest/SuccessRate’. If not set to MetaTest, it should probably be set to MetaTrain.
- test_task_names (list[str]) – List of task names to test. Should be in an order consistent with the task_id env_info, if that is present.
- worker_class (type) – Type of worker the Sampler should use.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
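Sampling “without replacement”, as mentioned for n_test_tasks, means that no task appears twice within one evaluation. A minimal sketch using only the standard library follows. The function sample_test_tasks and the task names are hypothetical, and this is not how MetaEvaluator or TaskSampler are implemented.

```python
import random


def sample_test_tasks(task_names, n_test_tasks=None, seed=None):
    """Pick distinct tasks; default to all of them when n_test_tasks is None."""
    rng = random.Random(seed)
    if n_test_tasks is None:
        n_test_tasks = len(task_names)
    # random.Random.sample never returns the same element twice.
    return rng.sample(task_names, n_test_tasks)


tasks = ['reach', 'push', 'pick']  # hypothetical task names
chosen = sample_test_tasks(tasks, n_test_tasks=2, seed=0)
```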
-
evaluate
(algo, test_rollouts_per_task=None)[source]¶ Evaluate the Meta-RL algorithm on the test tasks.
Parameters: - algo (garage.np.algos.MetaRLAlgorithm) – The algorithm to evaluate.
- test_rollouts_per_task (int or None) – Number of rollouts per task.
-
class
Snapshotter
(snapshot_dir='/home/docs/checkouts/readthedocs.org/user_builds/garage/checkouts/v2020.06.0/docs/data/local/experiment', snapshot_mode='last', snapshot_gap=1)[source]¶ Bases:
object
Snapshotter snapshots training data.
When training, it saves data to binary files. When resuming, it loads from saved data.
Parameters: - snapshot_dir (str) – Path to save the log and iteration snapshot.
- snapshot_mode (str) – Mode to save the snapshot. Can be either “all” (all iterations will be saved), “last” (only the last iteration will be saved), “gap” (every snapshot_gap iterations are saved), or “none” (do not save snapshots).
- snapshot_gap (int) – Gap between snapshot iterations. Wait this number of iterations before taking another snapshot.
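The four snapshot_mode values can be read as a per-iteration decision about whether to write a snapshot. The sketch below encodes that reading. should_save is a hypothetical helper and not the Snapshotter's code, and treating “gap” as “save when itr is a multiple of snapshot_gap” is an assumption based on the parameter description.

```python
def should_save(snapshot_mode, itr, snapshot_gap=1):
    """Decide whether iteration `itr` is written to disk (illustrative only)."""
    if snapshot_mode in ('all', 'last'):
        # 'all' keeps every snapshot; 'last' also saves each time,
        # with only the most recent snapshot being retained.
        return True
    if snapshot_mode == 'gap':
        # Assumption: save on iterations that are multiples of the gap.
        return itr % snapshot_gap == 0
    if snapshot_mode == 'none':
        return False
    raise ValueError('Invalid snapshot mode: {!r}'.format(snapshot_mode))


saved_iterations = [i for i in range(7) if should_save('gap', i, snapshot_gap=3)]
```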
-
load
(load_dir, itr='last')[source]¶ Load one snapshot of parameters from disk.
Parameters:
Returns: Loaded snapshot.
Return type:
Raises:
- ValueError – If itr is neither an integer nor one of (“last”, “first”).
- FileNotFoundError – If the snapshot file is not found in load_dir.
- NotAFileError – If the snapshot exists but is not a file.
-
save_itr_params
(itr, params)[source]¶ Save the parameters if at the right iteration.
Parameters: - itr (int) – Number of iterations. Used as the index of snapshot.
- params (obj) – Content of snapshot to be saved.
Raises: ValueError – If snapshot_mode is not one of “all”, “last” or “gap”.
-
class
SnapshotConfig
(snapshot_dir, snapshot_mode, snapshot_gap)¶ Bases:
tuple
-
snapshot_dir
¶ Alias for field number 0
-
snapshot_gap
¶ Alias for field number 2
-
snapshot_mode
¶ Alias for field number 1
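Since SnapshotConfig is a tuple whose fields are aliases for positions 0, 1 and 2, it behaves like a named tuple. The sketch below re-creates an equivalent type locally with collections.namedtuple rather than importing it from garage. The directory value is a hypothetical placeholder.

```python
import collections

# Local re-creation for illustration; field order matches the documented aliases.
SnapshotConfig = collections.namedtuple(
    'SnapshotConfig', ['snapshot_dir', 'snapshot_mode', 'snapshot_gap'])

config = SnapshotConfig(snapshot_dir='data/local/experiment',  # hypothetical path
                        snapshot_mode='last',
                        snapshot_gap=1)

# Fields are reachable by name or by position, and the tuple unpacks normally.
snapshot_dir, snapshot_mode, snapshot_gap = config
```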
-
-
class
TaskSampler
[source]¶ Bases:
abc.ABC
Class for sampling batches of tasks, represented as EnvUpdate objects.
Submodules
- garage.experiment.deterministic module
- garage.experiment.experiment module
- garage.experiment.experiment_wrapper module
- garage.experiment.local_runner module
- garage.experiment.local_tf_runner module
- garage.experiment.meta_evaluator module
- garage.experiment.snapshotter module
- garage.experiment.task_sampler module