garage.experiment package

Experiment functions.

run_experiment(method_call=None, batch_tasks=None, exp_prefix='experiment', exp_name=None, log_dir=None, script='garage.experiment.experiment_wrapper', python_command='python', dry=False, env=None, variant=None, force_cpu=False, pre_commands=None, **kwargs)[source]

Serialize the method call and run the experiment using the specified mode.

Parameters:
  • method_call (callable) – A method call.
  • batch_tasks (list[dict]) – A batch of method calls.
  • exp_prefix (str) – Name prefix for the experiment.
  • exp_name (str) – Name of the experiment.
  • log_dir (str) – Log directory for the experiment.
  • script (str) – The name of the entry-point Python script.
  • python_command (str) – Python command to run the experiment.
  • dry (bool) – Whether to do a dry-run, which only prints the commands without executing them.
  • env (dict) – Extra environment variables.
  • variant (dict) – If provided, should be a dictionary of parameters.
  • force_cpu (bool) – Whether to hide all GPU devices so that only the CPU is used.
  • pre_commands (str) – Commands to run before the experiment.
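
Examples

A minimal sketch, assuming (as in the bundled garage examples for this release) that the serialized method_call is later invoked with the snapshot configuration and the variant dictionary; verify the expected signature against your installed version. With dry=True the generated command is only printed, not executed.

from garage.experiment import run_experiment


def run_task(snapshot_config, variant_data):
    # Hypothetical task body; a real task would build and train an
    # algorithm here using the provided snapshot configuration.
    print('snapshot dir:', snapshot_config.snapshot_dir)
    print('variant:', variant_data)


run_experiment(
    run_task,
    exp_prefix='demo',
    variant={'seed': 1},
    dry=True,  # only print the launch command
)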
to_local_command(params, python_command='python', script='garage.experiment.experiment_wrapper')[source]
variant(*args, **kwargs)[source]
class VariantGenerator[source]

Bases: object

Usage:

vg = VariantGenerator()
vg.add("param1", [1, 2, 3])
vg.add("param2", ['x', 'y'])
vg.variants() => # all combinations of [1,2,3] x ['x','y']

Supports noncyclic dependency among parameters:

vg = VariantGenerator()
vg.add("param1", [1, 2, 3])
vg.add("param2", lambda param1: [param1+1, param1+2])
vg.variants() => # ..

add(key, vals, **kwargs)[source]
ivariants()[source]
to_name_suffix(variant)[source]
variant_dict(variant)[source]
variants(randomized=False)[source]
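
Examples

A short sketch of sweeping over generated variants. It assumes each dictionary returned by variants() can be passed straight through as run_experiment's variant argument; run_task is the same hypothetical task callable sketched for run_experiment above, and dry=True keeps the loop to printing launch commands.

from garage.experiment import VariantGenerator, run_experiment


def run_task(snapshot_config, variant_data):
    # Hypothetical task body; read hyperparameters from variant_data.
    print('training with variant:', variant_data)


vg = VariantGenerator()
vg.add('seed', [1, 2, 3])
vg.add('batch_size', [1000, 4000])

for v in vg.variants():  # 3 x 2 = 6 parameter combinations
    run_experiment(run_task, exp_prefix='sweep', variant=v, dry=True)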
class LocalRunner(snapshot_config, max_cpus=1)[source]

Bases: object

Base class of local runner.

Use Runner.setup(algo, env) to set up the algorithm and environment for the runner, and Runner.train() to start training.

Parameters:
  • snapshot_config (garage.experiment.SnapshotConfig) – The snapshot configuration used by LocalRunner to create the snapshotter. If None, it will create one with default settings.
  • max_cpus (int) – The maximum number of parallel sampler workers.

Note

To use TensorFlow environments, policies, or algorithms, please use LocalTFRunner().

Examples

# to train
runner = LocalRunner(snapshot_config)
env = Env(…)
policy = Policy(…)
algo = Algo(
    env=env,
    policy=policy,
    …)
runner.setup(algo, env)
runner.train(n_epochs=100, batch_size=4000)

# to resume immediately.
runner = LocalRunner(snapshot_config)
runner.restore(resume_from_dir)
runner.resume()

# to resume with modified training arguments.
runner = LocalRunner(snapshot_config)
runner.restore(resume_from_dir)
runner.resume(n_epochs=20)
log_diagnostics(pause_for_plot=False)[source]

Log diagnostics.

Parameters: pause_for_plot (bool) – Whether to pause before continuing so the plot can be inspected.
obtain_samples(itr, batch_size=None)[source]

Obtain one batch of samples.

Parameters:
  • itr (int) – Index of iteration (epoch).
  • batch_size (int) – Number of steps in batch. This is a hint that the sampler may or may not respect.
Returns:

One batch of samples.

restore(from_dir, from_epoch='last')[source]

Restore experiment from snapshot.

Parameters:
  • from_dir (str) – Directory containing the pickle file to resume the experiment from.
  • from_epoch (str or int) – The epoch to restore from. Can be 'first', 'last' or a number. Not applicable when snapshot_mode='last'.
Returns:

A SimpleNamespace for train()’s arguments.

resume(n_epochs=None, batch_size=None, n_epoch_cycles=None, plot=None, store_paths=None, pause_for_plot=None)[source]

Resume from restored experiment.

This method provides the same interface as train().

If an argument is not specified, it will default to the value saved from the last call to train().

Returns: The average return in the last epoch cycle.
save(epoch, paths=None)[source]

Save snapshot of current batch.

Parameters:
  • epoch (int) – Index of the epoch (iteration) to save.
  • paths (dict) – Batch of samples after preprocessing. If None, no paths will be logged to the snapshot.
setup(algo, env, sampler_cls=None, sampler_args=None)[source]

Set up runner for algorithm and environment.

This method saves algo and env within runner and creates a sampler.

Note

After setup() is called, all variables in the session should have been initialized. setup() respects existing values in the session, so policy weights can be loaded before setup().

Parameters:
  • algo (garage.np.algos.RLAlgorithm) – An algorithm instance.
  • env (garage.envs.GarageEnv) – An environment instance.
  • sampler_cls (garage.sampler.Sampler) – A sampler class.
  • sampler_args (dict) – Arguments to be passed to sampler constructor.
step_epochs()[source]

Step through each epoch.

This function returns a magic generator. When iterated through, this generator automatically performs services such as snapshotting and log management. It is used inside train() in each algorithm.

The generator initializes two variables: self.step_itr and self.step_path. To use the generator, these two have to be updated manually in each epoch, as the example below shows. (The example is written from inside an algorithm's train() method, so self refers to the algorithm.)

Yields: int – The next training epoch.

Examples

for epoch in runner.step_epochs():
    runner.step_path = runner.obtain_samples(…)
    self.train_once(…)
    runner.step_itr += 1
train(n_epochs, batch_size, n_epoch_cycles=1, plot=False, store_paths=False, pause_for_plot=False)[source]

Start training.

Parameters:
  • n_epochs (int) – Number of epochs.
  • batch_size (int) – Number of environment steps in one batch.
  • n_epoch_cycles (int) – Number of batches of samples in each epoch. This is only useful for off-policy algorithms; for on-policy algorithms this value should always be 1.
  • plot (bool) – Visualize the policy by doing a rollout after each epoch.
  • store_paths (bool) – Save paths in snapshot.
  • pause_for_plot (bool) – Whether to pause before continuing so the plot can be inspected.
Returns:

The average return in the last epoch cycle.

class Snapshotter(snapshot_dir='/home/docs/checkouts/readthedocs.org/user_builds/garage/checkouts/v2019.10.1/docs/data/local/experiment', snapshot_mode='last', snapshot_gap=1)[source]

Bases: object

Snapshotter snapshots training data.

When training, it saves data to binary files. When resuming, it loads from saved data.

Parameters:
  • snapshot_dir (str) – Path to save the log and iteration snapshot.
  • snapshot_mode (str) – Mode to save the snapshot. Can be either "all" (all iterations will be saved), "last" (only the last iteration will be saved), "gap" (every snapshot_gap iterations are saved), or "none" (do not save snapshots).
  • snapshot_gap (int) – Gap between snapshot iterations. Wait this number of iterations before taking another snapshot.
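
Examples

A self-contained sketch, assuming the snapshotter writes pickled parameter dictionaries under snapshot_dir and that load() reads them back; the payload dictionary is an arbitrary stand-in for real training parameters.

import tempfile

from garage.experiment import Snapshotter

snapshot_dir = tempfile.mkdtemp()
snapshotter = Snapshotter(snapshot_dir=snapshot_dir,
                          snapshot_mode='last',
                          snapshot_gap=1)

# Save a toy parameter dict for iteration 0, then read it back.
snapshotter.save_itr_params(0, {'itr': 0, 'note': 'toy payload'})
params = snapshotter.load(snapshot_dir, itr='last')
print(params)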
load(load_dir, itr='last')[source]

Load one snapshot of parameters from disk.

Parameters:
  • load_dir (str) – Directory containing the pickle file to resume the experiment from.
  • itr (int or str) – Iteration to load. Can be an integer, 'last' or 'first'.
Returns:

Loaded snapshot

Return type:

dict

save_itr_params(itr, params)[source]

Save the parameters if the current iteration should be snapshotted according to snapshot_mode and snapshot_gap.

snapshot_dir

Return the snapshot directory.

snapshot_gap

Return the gap (number of iterations) between snapshots.

snapshot_mode

Return the snapshot mode.

class SnapshotConfig(snapshot_dir, snapshot_mode, snapshot_gap)

Bases: tuple

snapshot_dir

Alias for field number 0

snapshot_gap

Alias for field number 2

snapshot_mode

Alias for field number 1
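
Examples

A minimal sketch of building a SnapshotConfig and handing it to LocalRunner, which uses it to create its snapshotter (see LocalRunner above); the directory path is illustrative.

from garage.experiment import LocalRunner, SnapshotConfig

config = SnapshotConfig(snapshot_dir='data/local/my-experiment',
                        snapshot_mode='last',
                        snapshot_gap=1)
runner = LocalRunner(snapshot_config=config)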