garage.experiment package

Experiment functions.

run_experiment(method_call=None, batch_tasks=None, exp_prefix='experiment', exp_name=None, log_dir=None, script='garage.experiment.experiment_wrapper', python_command='python', dry=False, env=None, variant=None, force_cpu=False, pre_commands=None, **kwargs)[source]

Serialize the method call and run the experiment using the specified mode.

Parameters:
  • method_call (callable) – A method call.
  • batch_tasks (list[dict]) – A batch of method calls.
  • exp_prefix (str) – Name prefix for the experiment.
  • exp_name (str) – Name of the experiment.
  • log_dir (str) – Log directory for the experiment.
  • script (str) – The name of the entry-point Python script.
  • python_command (str) – Python command to run the experiment.
  • dry (bool) – Whether to do a dry-run, which only prints the commands without executing them.
  • env (dict) – Extra environment variables.
  • variant (dict) – If provided, should be a dictionary of parameters.
  • force_cpu (bool) – Whether to hide all GPU devices so that only the CPU is used.
  • pre_commands (str) – Commands to run before the experiment.
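
Examples

A minimal sketch, assuming (as in the bundled garage examples for this release) that the serialized method_call is later invoked with the snapshot configuration and the variant dictionary; verify the expected signature against your installed version. With dry=True the generated command is only printed, not executed.

from garage.experiment import run_experiment


def run_task(snapshot_config, variant_data):
    # Hypothetical task body; a real task would build and train an
    # algorithm here using the provided snapshot configuration.
    print('snapshot dir:', snapshot_config.snapshot_dir)
    print('variant:', variant_data)


run_experiment(
    run_task,
    exp_prefix='demo',
    variant={'seed': 1},
    dry=True,  # only print the launch command
)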
to_local_command(params, python_command='python', script='garage.experiment.experiment_wrapper')[source]
variant(*args, **kwargs)[source]
class VariantGenerator[source]

Bases: object

Usage:

vg = VariantGenerator()
vg.add("param1", [1, 2, 3])
vg.add("param2", ['x', 'y'])
vg.variants() => # all combinations of [1,2,3] x ['x','y']

Supports noncyclic dependency among parameters:

vg = VariantGenerator()
vg.add("param1", [1, 2, 3])
vg.add("param2", lambda param1: [param1+1, param1+2])
vg.variants() => # ..

add(key, vals, **kwargs)[source]
ivariants()[source]
to_name_suffix(variant)[source]
variant_dict(variant)[source]
variants(randomized=False)[source]
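
Examples

A short sketch of sweeping over generated variants. It assumes each dictionary returned by variants() can be passed straight through as run_experiment's variant argument; run_task is the same hypothetical task callable sketched for run_experiment above, and dry=True keeps the loop to printing launch commands.

from garage.experiment import VariantGenerator, run_experiment


def run_task(snapshot_config, variant_data):
    # Hypothetical task body; read hyperparameters from variant_data.
    print('training with variant:', variant_data)


vg = VariantGenerator()
vg.add('seed', [1, 2, 3])
vg.add('batch_size', [1000, 4000])

for v in vg.variants():  # 3 x 2 = 6 parameter combinations
    run_experiment(run_task, exp_prefix='sweep', variant=v, dry=True)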
class LocalRunner(snapshot_config, max_cpus=1)[source]

Bases: object

Base class of local runner.

Use Runner.setup(algo, env) to set up the algorithm and environment for the runner, and Runner.train() to start training.

Parameters:
  • snapshot_config (garage.experiment.SnapshotConfig) – The snapshot configuration used by LocalRunner to create the snapshotter. If None, it will create one with default settings.
  • max_cpus (int) – The maximum number of parallel sampler workers.

Note

To use TensorFlow environments, policies, or algorithms, please use LocalTFRunner().

Examples

# to train
runner = LocalRunner(snapshot_config)
env = Env(…)
policy = Policy(…)
algo = Algo(
    env=env,
    policy=policy,
    …)
runner.setup(algo, env)
runner.train(n_epochs=100, batch_size=4000)

# to resume immediately.
runner = LocalRunner(snapshot_config)
runner.restore(resume_from_dir)
runner.resume()

# to resume with modified training arguments.
runner = LocalRunner(snapshot_config)
runner.restore(resume_from_dir)
runner.resume(n_epochs=20)
log_diagnostics(pause_for_plot=False)[source]

Log diagnostics.

Parameters: pause_for_plot (bool) – Whether to pause before continuing so the plot can be inspected.
obtain_samples(itr, batch_size=None)[source]

Obtain one batch of samples.

Parameters:
  • itr (int) – Index of iteration (epoch).
  • batch_size (int) – Number of steps in batch. This is a hint that the sampler may or may not respect.
Returns:

One batch of samples.

restore(from_dir, from_epoch='last')[source]

Restore experiment from snapshot.

Parameters:
  • from_dir (str) – Directory containing the pickle file to resume the experiment from.
  • from_epoch (str or int) – The epoch to restore from. Can be 'first', 'last' or a number. Not applicable when snapshot_mode='last'.
Returns:

A SimpleNamespace for train()’s arguments.

resume(n_epochs=None, batch_size=None, n_epoch_cycles=None, plot=None, store_paths=None, pause_for_plot=None)[source]

Resume from restored experiment.

This method provides the same interface as train().

If an argument is not specified, it will default to the value saved from the last call to train().

Returns: The average return in the last epoch cycle.
save(epoch, paths=None)[source]

Save snapshot of current batch.

Parameters:
  • epoch (int) – Index of the epoch (iteration) to save.
  • paths (dict) – Batch of samples after preprocessing. If None, no paths will be logged to the snapshot.
setup(algo, env, sampler_cls=None, sampler_args=None)[source]

Set up runner for algorithm and environment.

This method saves algo and env within runner and creates a sampler.

Note

After setup() is called, all variables in the session should have been initialized. setup() respects existing values in the session, so policy weights can be loaded before setup().

Parameters:
  • algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
  • env (garage.envs.GarageEnv) – An environment instance.
  • sampler_cls (garage.sampler.Sampler) – A sampler class.
  • sampler_args (dict) – Arguments to be passed to sampler constructor.
step_epochs()[source]

Step through each epoch.

This function returns a magic generator. When iterated through, this generator automatically performs services such as snapshotting and log management. It is used inside train() in each algorithm.

The generator initializes two variables: self.step_itr and self.step_path. To use the generator, these two have to be updated manually in each epoch, as the example below shows. (The example is written from inside an algorithm's train() method, so self refers to the algorithm.)

Yields: int – The next training epoch.

Examples

for epoch in runner.step_epochs():
    runner.step_path = runner.obtain_samples(…)
    self.train_once(…)
    runner.step_itr += 1
train(n_epochs, batch_size, n_epoch_cycles=1, plot=False, store_paths=False, pause_for_plot=False)[source]

Start training.

Parameters:
  • n_epochs (int) – Number of epochs.
  • batch_size (int) – Number of environment steps in one batch.
  • n_epoch_cycles (int) – Number of batches of samples in each epoch. This is only useful for off-policy algorithms; for on-policy algorithms this value should always be 1.
  • plot (bool) – Visualize the policy by doing a rollout after each epoch.
  • store_paths (bool) – Save paths in snapshot.
  • pause_for_plot (bool) – Whether to pause before continuing so the plot can be inspected.
Returns:

The average return in the last epoch cycle.

class Snapshotter(snapshot_dir='/home/docs/checkouts/readthedocs.org/user_builds/garage/checkouts/v2019.10.1/docs/data/local/experiment', snapshot_mode='last', snapshot_gap=1)[source]

Bases: object

Snapshotter snapshots training data.

When training, it saves data to binary files. When resuming, it loads from saved data.

Parameters:
  • snapshot_dir (str) – Path to save the log and iteration snapshot.
  • snapshot_mode (str) – Mode to save the snapshot. Can be either "all" (all iterations will be saved), "last" (only the last iteration will be saved), "gap" (every snapshot_gap iterations are saved), or "none" (do not save snapshots).
  • snapshot_gap (int) – Gap between snapshot iterations. Wait this number of iterations before taking another snapshot.
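
Examples

A self-contained sketch, assuming the snapshotter writes pickled parameter dictionaries under snapshot_dir and that load() reads them back; the payload dictionary is an arbitrary stand-in for real training parameters.

import tempfile

from garage.experiment import Snapshotter

snapshot_dir = tempfile.mkdtemp()
snapshotter = Snapshotter(snapshot_dir=snapshot_dir,
                          snapshot_mode='last',
                          snapshot_gap=1)

# Save a toy parameter dict for iteration 0, then read it back.
snapshotter.save_itr_params(0, {'itr': 0, 'note': 'toy payload'})
params = snapshotter.load(snapshot_dir, itr='last')
print(params)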
load(load_dir, itr='last')[source]

Load one snapshot of parameters from disk.

Parameters:
  • load_dir (str) – Directory containing the pickle file to resume the experiment from.
  • itr (int or str) – Iteration to load. Can be an integer, 'last' or 'first'.
Returns:

Loaded snapshot

Return type:

dict

save_itr_params(itr, params)[source]

Save the parameters if the current iteration should be snapshotted according to snapshot_mode and snapshot_gap.

snapshot_dir

Return the snapshot directory.

snapshot_gap

Return the gap (number of iterations) between snapshots.

snapshot_mode

Return the snapshot mode.

class SnapshotConfig(snapshot_dir, snapshot_mode, snapshot_gap)

Bases: tuple

snapshot_dir

Alias for field number 0

snapshot_gap

Alias for field number 2

snapshot_mode

Alias for field number 1
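
Examples

A minimal sketch of building a SnapshotConfig and handing it to LocalRunner, which uses it to create its snapshotter (see LocalRunner above); the directory path is illustrative.

from garage.experiment import LocalRunner, SnapshotConfig

config = SnapshotConfig(snapshot_dir='data/local/my-experiment',
                        snapshot_mode='last',
                        snapshot_gap=1)
runner = LocalRunner(snapshot_config=config)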