garage.experiment.local_runner module
Provides algorithms with access to most of garage’s features.
-
class ExperimentStats(total_epoch, total_itr, total_env_steps, last_path)
Bases: object
Statistics of an experiment.
Parameters:
-
class LocalRunner(snapshot_config, max_cpus=1)
Bases: object
Base class of local runner.
Use Runner.setup(algo, env) to set up the algorithm and environment for the runner, and Runner.train() to start training.
Parameters: - snapshot_config (garage.experiment.SnapshotConfig) – The snapshot configuration used by LocalRunner to create the snapshotter. If None, it will create one with default settings.
- max_cpus (int) – The maximum number of parallel sampler workers.
Note
For the use of any TensorFlow environments, policies and algorithms, please use LocalTFRunner().
Examples
# to train
runner = LocalRunner()
env = Env(…)
policy = Policy(…)
algo = Algo(env=env, policy=policy, …)
runner.setup(algo, env)
runner.train(n_epochs=100, batch_size=4000)

# to resume immediately.
runner = LocalRunner()
runner.restore(resume_from_dir)
runner.resume()

# to resume with modified training arguments.
runner = LocalRunner()
runner.restore(resume_from_dir)
runner.resume(n_epochs=20)
-
get_env_copy()
Get a copy of the environment.
Returns: An environment instance.
Return type: garage.envs.GarageEnv
-
log_diagnostics(pause_for_plot=False)
Log diagnostics.
Parameters: pause_for_plot (bool) – Pause for plot.
-
make_sampler(sampler_cls, *, seed=None, n_workers=1, max_path_length=None, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, sampler_args=None, worker_args=None)
Construct a Sampler from a Sampler class.
Parameters: - sampler_cls (type) – The type of sampler to construct.
- seed (int) – Seed to use in sampler workers.
- max_path_length (int) – Maximum path length to be sampled by the sampler. Paths longer than this will be truncated.
- n_workers (int) – The number of workers the sampler should use.
- worker_class (type) – Type of worker the Sampler should use.
- sampler_args (dict or None) – Additional arguments that should be passed to the sampler.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
Raises: ValueError – If max_path_length isn’t passed and the algorithm doesn’t contain a max_path_length field, or if the algorithm doesn’t have a policy field.
Returns: An instance of the sampler class.
Return type: sampler_cls
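The ValueError conditions above can be illustrated with a small, self-contained sketch. The helper `resolve_max_path_length` is hypothetical and is not part of garage; it assumes that an explicitly passed max_path_length takes precedence over the algorithm's own field, and the order of the two checks is also an assumption.

```python
from types import SimpleNamespace


def resolve_max_path_length(algo, max_path_length=None):
    """Pick the path-length limit (hypothetical helper, not garage's code)."""
    if not hasattr(algo, "policy"):
        raise ValueError("the algorithm must have a policy field")
    if max_path_length is None:
        # Fall back to the algorithm's own field when nothing was passed.
        if not hasattr(algo, "max_path_length"):
            raise ValueError(
                "pass max_path_length or give the algorithm a "
                "max_path_length field")
        max_path_length = algo.max_path_length
    return max_path_length


# Placeholder algorithm object; a real one would be an RLAlgorithm instance.
algo = SimpleNamespace(policy="policy-placeholder", max_path_length=200)
from_algo = resolve_max_path_length(algo)
explicit = resolve_max_path_length(algo, max_path_length=50)
```

Here `from_algo` comes from the algorithm's field, while `explicit` uses the value passed by the caller.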
-
obtain_samples(itr, batch_size=None, agent_update=None, env_update=None)
Obtain one batch of samples.
Parameters: - itr (int) – Index of iteration (epoch).
- batch_size (int) – Number of steps in batch. This is a hint that the sampler may or may not respect.
- agent_update (object) – Value which will be passed into the agent_update_fn before doing rollouts. If a list is passed in, it must have length exactly factory.n_workers, and will be spread across the workers.
- env_update (object) – Value which will be passed into the env_update_fn before doing rollouts. If a list is passed in, it must have length exactly factory.n_workers, and will be spread across the workers.
Raises: ValueError – If the runner was initialized without a sampler, or batch_size wasn’t provided here or to train().
Returns: One batch of samples.
Return type:
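The list rule for agent_update and env_update can be sketched as follows. The helper `spread_updates` is hypothetical and is not garage's API. Repeating a single non-list value for every worker is an assumption made for illustration; the text above only specifies the behavior for lists.

```python
def spread_updates(update, n_workers):
    """Return one update per worker (hypothetical helper, not garage's API)."""
    if isinstance(update, list):
        # A list must supply exactly one entry per worker.
        if len(update) != n_workers:
            raise ValueError(
                "expected %d updates, got %d" % (n_workers, len(update)))
        return list(update)
    # Assumption: a single value is repeated so every worker receives it.
    return [update] * n_workers


per_worker = spread_updates(["params-a", "params-b"], n_workers=2)
shared = spread_updates("params-shared", n_workers=3)
```

A list of the wrong length raises ValueError instead of being silently truncated or padded.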
-
restore(from_dir, from_epoch='last')
Restore experiment from snapshot.
Parameters:
Returns: Arguments for train().
Return type:
-
resume(n_epochs=None, batch_size=None, plot=None, store_paths=None, pause_for_plot=None)
Resume from restored experiment.
This method provides the same interface as train().
Any argument left unspecified defaults to the value saved from the last call to train().
Parameters:
Raises: NotSetupError – If resume() is called before restore().
Returns: The average return in the last epoch cycle.
Return type:
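The defaulting rule for resume() can be sketched in plain Python. The names `SavedTrainArgs` and `merge_resume_args` are hypothetical stand-ins, not garage classes or functions. The sketch assumes that None means "not specified", which matches the None defaults in the signature above.

```python
from dataclasses import dataclass, replace


@dataclass
class SavedTrainArgs:
    """Hypothetical stand-in for the arguments saved by the last train()."""
    n_epochs: int
    batch_size: int
    plot: bool
    store_paths: bool
    pause_for_plot: bool


def merge_resume_args(saved, **overrides):
    """Return a copy of `saved` where only non-None overrides replace values."""
    given = {name: value for name, value in overrides.items()
             if value is not None}
    return replace(saved, **given)


saved = SavedTrainArgs(n_epochs=100, batch_size=4000, plot=False,
                       store_paths=False, pause_for_plot=False)
# Only n_epochs is overridden; batch_size keeps its saved value.
resumed = merge_resume_args(saved, n_epochs=20, batch_size=None)
```

Because `replace` returns a new object, the saved arguments stay unchanged and can be reused for a later resume.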
-
save(epoch)
Save snapshot of current batch.
Parameters: epoch (int) – Epoch.
Raises: NotSetupError – If save() is called before the runner is set up.
-
setup(algo, env, sampler_cls=None, sampler_args=None, n_workers=1, worker_class=<class 'garage.sampler.default_worker.DefaultWorker'>, worker_args=None)
Set up runner for algorithm and environment.
This method saves algo and env within runner and creates a sampler.
Note
After setup() is called, all variables in the session should have been initialized. setup() respects existing values in the session, so policy weights can be loaded before setup().
Parameters: - algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
- env (garage.envs.GarageEnv) – An environment instance.
- sampler_cls (garage.sampler.Sampler) – A sampler class.
- sampler_args (dict) – Arguments to be passed to sampler constructor.
- n_workers (int) – The number of workers the sampler should use.
- worker_class (type) – Type of worker the sampler should use.
- worker_args (dict or None) – Additional arguments that should be passed to the worker.
Raises: ValueError – If sampler_cls is passed and the algorithm doesn’t contain a max_path_length field.
-
step_epochs()
Step through each epoch.
This function returns a magic generator. When iterated through, this generator automatically performs services such as snapshotting and log management. It is used inside train() in each algorithm.
The generator initializes two variables: self.step_itr and self.step_path. To use the generator, these two have to be updated manually in each epoch, as the example shows below.
Yields: int – The next training epoch.
Examples
for epoch in runner.step_epochs():
    runner.step_path = runner.obtain_samples(…)
    self.train_once(…)
    runner.step_itr += 1
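The generator pattern described above can be sketched without garage. The class `SketchRunner` and its `snapshots` list are hypothetical and are not garage's implementation. The sketch only shows how code placed after a `yield` runs once the caller's loop body has finished, which is where per-epoch services such as snapshotting could be performed.

```python
class SketchRunner:
    """Hypothetical stand-in that mimics the step_epochs() contract."""

    def __init__(self, n_epochs):
        self.n_epochs = n_epochs
        self.step_itr = None
        self.step_path = None
        self.snapshots = []  # stands in for snapshotting and log management

    def step_epochs(self):
        # The generator initializes the two variables the caller must update.
        self.step_itr = 0
        self.step_path = None
        for epoch in range(self.n_epochs):
            yield epoch
            # Control returns here after the caller's loop body has run,
            # so bookkeeping happens without the caller invoking it.
            self.snapshots.append((epoch, self.step_itr, self.step_path))


runner = SketchRunner(n_epochs=3)
for epoch in runner.step_epochs():
    # Placeholder for the samples an algorithm would collect in this epoch.
    runner.step_path = ["path-for-epoch-%d" % epoch]
    runner.step_itr += 1
```

Each recorded entry reflects the values the loop body assigned during that epoch, which is why step_itr and step_path must be updated manually inside the loop.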
-
total_env_steps
Total environment steps collected.
Returns: Total environment steps collected.
Return type: int
-
train(n_epochs, batch_size=None, plot=False, store_paths=False, pause_for_plot=False)
Start training.
Parameters:
Raises: NotSetupError – If train() is called before setup().
Returns: The average return in the last epoch cycle.
Return type:
-
exception NotSetupError
Bases: Exception
Raised when an experiment is about to run without setup.
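A minimal sketch of the guard this exception supports is shown below. Both classes are local stand-ins defined for illustration: `GuardedRunner` is hypothetical, and the `NotSetupError` here is redefined locally rather than imported from garage. The check on `_algo` and `_env` is an assumption about how "set up" might be detected.

```python
class NotSetupError(Exception):
    """Local stand-in for the exception documented above."""


class GuardedRunner:
    """Hypothetical runner that refuses to train before setup()."""

    def __init__(self):
        self._algo = None
        self._env = None

    def setup(self, algo, env):
        self._algo = algo
        self._env = env

    def train(self, n_epochs):
        if self._algo is None or self._env is None:
            raise NotSetupError(
                "Use setup() to set up the runner before training.")
        return n_epochs  # placeholder; a real runner would run the algorithm


runner = GuardedRunner()
try:
    runner.train(n_epochs=1)
except NotSetupError as err:
    message = str(err)
```

Calling setup() first lets the same train() call proceed.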
-
class SetupArgs(sampler_cls, sampler_args, seed)
Bases: object
Arguments to set up a runner.
Parameters: - sampler_cls (garage.sampler.Sampler) – A sampler class.
- sampler_args (dict) – Arguments to be passed to sampler constructor.
- seed (int) – Random seed.
-
class TrainArgs(n_epochs, batch_size, plot, store_paths, pause_for_plot, start_epoch)
Bases: object
Arguments to call train() or resume().
Parameters: - n_epochs (int) – Number of epochs.
- batch_size (int) – Number of environment steps in one batch.
- plot (bool) – Visualize policy by doing rollout after each epoch.
- store_paths (bool) – Save paths in snapshot.
- pause_for_plot (bool) – Pause for plot.
- start_epoch (int) – The starting epoch. Used for resume().