garage.experiment.local_runner module¶
Provides algorithms with access to most of garage’s features.
class LocalRunner(snapshot_config, max_cpus=1)[source]¶ Bases: object
Base class of local runner.
Use Runner.setup(algo, env) to set up the algorithm and environment for the runner, and Runner.train() to start training.
Parameters: - snapshot_config (garage.experiment.SnapshotConfig) – The snapshot configuration used by LocalRunner to create the snapshotter. If None, it will create one with default settings.
- max_cpus (int) – The maximum number of parallel sampler workers.
Note
For TensorFlow environments, policies, and algorithms, use LocalTFRunner() instead.
Examples
    # to train
    runner = LocalRunner()
    env = Env(…)
    policy = Policy(…)
    algo = Algo(env=env, policy=policy, …)
    runner.setup(algo, env)
    runner.train(n_epochs=100, batch_size=4000)

    # to resume immediately.
    runner = LocalRunner()
    runner.restore(resume_from_dir)
    runner.resume()

    # to resume with modified training arguments.
    runner = LocalRunner()
    runner.restore(resume_from_dir)
    runner.resume(n_epochs=20)
log_diagnostics(pause_for_plot=False)[source]¶ Log diagnostics.
Parameters: pause_for_plot (bool) – Pause for plot.
obtain_samples(itr, batch_size=None)[source]¶ Obtain one batch of samples.
Parameters: - itr (int) – Index of the current iteration (epoch).
- batch_size (int) – Number of environment steps in one batch. If None, the batch size from the last call to train() is used.
Returns: One batch of samples.
restore(from_dir, from_epoch='last')[source]¶ Restore experiment from snapshot.
Parameters: - from_dir (str) – Directory of the snapshot to restore from.
- from_epoch (str or int) – The epoch to restore from. Can be 'first', 'last' or a number.
Returns: A SimpleNamespace for train()'s arguments.
resume(n_epochs=None, batch_size=None, n_epoch_cycles=None, plot=None, store_paths=None, pause_for_plot=None)[source]¶ Resume from restored experiment.
This method provides the same interface as train().
If not specified, an argument will default to the saved arguments from the last call to train().
Returns: The average return in the last epoch cycle.
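The argument-defaulting behavior of resume() can be sketched as follows. This is an illustrative stand-in, not garage's implementation; merge_train_args and the saved values are hypothetical names chosen for the example. Arguments saved by the last call to train() act as defaults, and any non-None argument passed to resume() overrides them.

```python
from types import SimpleNamespace

def merge_train_args(saved, overrides):
    """Merge resume()-style overrides into saved train() arguments.

    Any override that is None falls back to the value saved by the
    last call to train(), mirroring resume()'s documented behavior.
    """
    merged = vars(saved).copy()
    for name, value in overrides.items():
        if value is not None:
            merged[name] = value
    return SimpleNamespace(**merged)

# Hypothetical arguments saved by the last train() call.
saved_args = SimpleNamespace(n_epochs=100, batch_size=4000, n_epoch_cycles=1)

# Mimics resume(n_epochs=20): batch_size and n_epoch_cycles are kept.
resumed = merge_train_args(saved_args, {'n_epochs': 20, 'batch_size': None})
```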
setup(algo, env, sampler_cls=None, sampler_args=None)[source]¶ Set up runner for algorithm and environment.
This method saves algo and env within runner and creates a sampler.
Note
After setup() is called, all variables in the session should have been initialized. setup() respects existing values in the session, so policy weights can be loaded before setup().
Parameters: - algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
- env (garage.envs.GarageEnv) – An environment instance.
- sampler_cls (garage.sampler.Sampler) – A sampler class.
- sampler_args (dict) – Arguments to be passed to sampler constructor.
step_epochs()[source]¶ Step through each epoch.
This function returns a magic generator. When iterated through, this generator automatically performs services such as snapshotting and log management. It is used inside train() in each algorithm.
The generator initializes two variables: self.step_itr and self.step_path. To use the generator, these two have to be updated manually in each epoch, as the example shows below.
Yields: int – The next training epoch.
Examples
    for epoch in runner.step_epochs():
        runner.step_path = runner.obtain_samples(…)
        self.train_once(…)
        runner.step_itr += 1
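The contract described above (the generator hands control to the training loop, the loop updates step_itr and step_path by hand, and per-epoch services run between iterations) can be sketched with a minimal self-contained stand-in. RunnerSketch is hypothetical, not garage's implementation; snapshotting is stubbed out as a list append so the control flow is visible.

```python
class RunnerSketch:
    """Minimal stand-in showing how a step_epochs()-style generator works."""

    def __init__(self):
        self.step_itr = 0      # must be advanced by the caller each epoch
        self.step_path = None  # set by the caller from its sampler
        self.snapshots = []    # stand-in for snapshotting / log management

    def step_epochs(self, n_epochs):
        for epoch in range(n_epochs):
            yield epoch        # hand control to the training loop
            # Per-epoch services run after the loop body finishes.
            self.snapshots.append((epoch, self.step_itr))

runner = RunnerSketch()
for epoch in runner.step_epochs(3):
    runner.step_path = ['fake_rollout']  # stands in for obtain_samples(...)
    runner.step_itr += 1                 # manual update, as documented
```

Note that if the caller forgets to advance step_itr, every snapshot records the same iteration index, which is why the docstring stresses the manual updates.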
train(n_epochs, batch_size, n_epoch_cycles=1, plot=False, store_paths=False, pause_for_plot=False)[source]¶ Start training.
Parameters: - n_epochs (int) – Number of epochs.
- batch_size (int) – Number of environment steps in one batch.
- n_epoch_cycles (int) – Number of batches of samples in each epoch. This is only useful for off-policy algorithms. For on-policy algorithms this value should always be 1.
- plot (bool) – Visualize policy by doing rollout after each epoch.
- store_paths (bool) – Save paths in snapshot.
- pause_for_plot (bool) – Pause for plot.
Returns: The average return in the last epoch cycle.
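The relationship between n_epochs and n_epoch_cycles can be illustrated with a loop-structure sketch. This is not garage's implementation, only the shape of the iteration it documents: each epoch collects n_epoch_cycles batches, so an off-policy run with n_epoch_cycles > 1 gathers several smaller batches per epoch, while an on-policy run gathers exactly one.

```python
def train_sketch(n_epochs, batch_size, n_epoch_cycles=1):
    """Illustrative epoch/cycle loop; returns total number of batches collected."""
    batches = []
    itr = 0
    for epoch in range(n_epochs):
        for cycle in range(n_epoch_cycles):
            batches.append((itr, batch_size))  # stands in for one sampled batch
            itr += 1
        # Per-epoch services (snapshotting, logging) would run here.
    return len(batches)

# Off-policy style: 2 epochs x 5 cycles = 10 batches.
off_policy_batches = train_sketch(n_epochs=2, batch_size=4000, n_epoch_cycles=5)

# On-policy style: one batch per epoch.
on_policy_batches = train_sketch(n_epochs=3, batch_size=4000)
```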