Quick Start with garage¶
What is garage?¶
garage is a reinforcement learning (RL) toolkit for developing and evaluating algorithms. The garage library also provides a collection of state-of-the-art implementations of RL algorithms.
The toolkit provides a wide range of modular tools for implementing RL algorithms, including:
Composable neural network models
Replay buffers
High-performance samplers
An expressive experiment definition interface
Tools for reproducibility (e.g. set a global random seed which all components respect; see the sketch after this list)
Logging to many outputs, including TensorBoard
Reliable experiment checkpointing and resuming
Environment interfaces for many popular benchmark suites
Support for running garage in diverse environments, including always up-to-date Docker containers
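For example, the reproducibility tooling mentioned in the list above largely comes down to a single call that seeds every component garage controls. A minimal sketch (the seed value is an arbitrary placeholder):
from garage.experiment import deterministic

# Seed Python's random module, numpy, and the active deep learning
# framework so repeated runs of the same experiment behave identically.
deterministic.set_seed(1)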
Why garage?¶
garage aims to provide both researchers and developers with:
a flexible and structured tool for developing algorithms to solve a variety of RL problems,
a standardized and reproducible environment for experimenting and evaluating RL algorithms,
a collection of benchmarks and examples of RL algorithms.
Kick Start garage¶
This quickstart shows how to get up and running with garage in 5 minutes.
import garage
Algorithms¶
An array of algorithms is available in garage:
Algorithm               | Framework(s)
------------------------|---------------------
CEM                     | numpy
CMA-ES                  | numpy
REINFORCE (a.k.a. VPG)  | PyTorch, TensorFlow
DDPG                    | PyTorch, TensorFlow
DQN                     | TensorFlow
DDQN                    | TensorFlow
ERWR                    | TensorFlow
NPO                     | TensorFlow
PPO                     | PyTorch, TensorFlow
REPS                    | TensorFlow
TD3                     | TensorFlow
TNPG                    | TensorFlow
TRPO                    | PyTorch, TensorFlow
MAML                    | PyTorch
RL2                     | TensorFlow
PEARL                   | PyTorch
SAC                     | PyTorch
MTSAC                   | PyTorch
MTPPO                   | PyTorch, TensorFlow
MTTRPO                  | PyTorch, TensorFlow
Task Embedding          | TensorFlow
Behavioral Cloning      | PyTorch
They are organized in the GitHub repository as follows:
└── garage
├── envs
├── experiment
├── misc
├── np
├── plotter
├── replay_buffer
├── sampler
├── tf
└── torch
Note: each framework listed in the table corresponds to the subpackage (tf, torch, or np) that contains that algorithm's implementation.
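In practice, this layout means an algorithm can have one implementation per framework, each living in the corresponding subpackage. A small illustrative sketch (assuming both the TensorFlow and PyTorch extras are installed):
# TRPO has both a TensorFlow and a PyTorch implementation.
from garage.tf.algos import TRPO as TF_TRPO
from garage.torch.algos import TRPO as PyTorch_TRPO

# numpy-only algorithms such as CEM live under garage.np.
from garage.np.algos import CEM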
A simple PyTorch example that imports the TRPO algorithm, along with the GaussianMLPPolicy policy, the GaussianMLPValueFunction value function, and the LocalSampler sampler, is shown below:
import gym
import torch

from garage.envs import GarageEnv, normalize
from garage.sampler import LocalSampler
from garage.torch.algos import TRPO as PyTorch_TRPO
from garage.torch.policies import GaussianMLPPolicy as PyTorch_GMP
from garage.torch.value_functions import GaussianMLPValueFunction


def trpo_garage_pytorch(env_id):
    # Wrap the Gym environment given by env_id and normalize it.
    env = GarageEnv(normalize(gym.make(env_id)))

    # Gaussian MLP policy with two hidden layers of 32 tanh units.
    policy = PyTorch_GMP(env.spec,
                         hidden_sizes=[32, 32],
                         hidden_nonlinearity=torch.tanh,
                         output_nonlinearity=None)

    # Gaussian MLP value function used as the baseline for TRPO.
    value_function = GaussianMLPValueFunction(env_spec=env.spec,
                                              hidden_sizes=(32, 32),
                                              hidden_nonlinearity=torch.tanh,
                                              output_nonlinearity=None)

    # Sampler that collects episodes in the local process.
    sampler = LocalSampler(agents=policy,
                           envs=env,
                           max_episode_length=env.spec.max_episode_length)

    algo = PyTorch_TRPO(env_spec=env.spec,
                        policy=policy,
                        value_function=value_function,
                        sampler=sampler,
                        discount=0.99,
                        gae_lambda=0.97)
Note that this snippet only constructs the algorithm and its components; training is driven by the experiment launcher described in Running Experiments below. The full code can be found here.
To learn more about implementing new algorithms, see this guide.
Running Examples¶
Garage ships with example files to help you get started. To get a list of examples, run:
garage examples
This prints a list of examples along with their fully qualified names, such as:
tf/dqn_cartpole.py (garage.examples.tf.dqn_cartpole.py)
To get the source of an example, run:
garage examples tf/dqn_cartpole.py
This will print the source on your console, which you can write to a file as follows:
garage examples tf/dqn_cartpole.py > tf_dqn_cartpole.py
You can also run an example directly by passing its fully qualified module name to python -m, as follows:
python -m garage.examples.tf.dqn_cartpole
You can also access the examples for a specific version on GitHub by visiting the tag corresponding to that version and then navigating to src/garage/examples.
Running Experiments¶
In garage, experiments are run using the “experiment launcher” wrap_experiment, a Python decorator that can be imported directly from the garage package.
from garage import wrap_experiment
In addition, objects such as the trainer, environment, policy, and sampler are commonly used when constructing experiments in garage, as in the example below.
"""A regression test for automatic benchmarking garage-PyTorch-TRPO."""
import torch
from garage import wrap_experiment
from garage.envs import GymEnv, normalize
from garage.experiment import deterministic
from garage.sampler import LocalSampler
from garage.torch.algos import TRPO as PyTorch_TRPO
from garage.torch.policies import GaussianMLPPolicy as PyTorch_GMP
from garage.torch.value_functions import GaussianMLPValueFunction
from garage.trainer import Trainer
hyper_parameters = {
'hidden_sizes': [32, 32],
'max_kl': 0.01,
'gae_lambda': 0.97,
'discount': 0.99,
'n_epochs': 999,
'batch_size': 1024,
}
@wrap_experiment
def trpo_garage_pytorch(ctxt, env_id, seed):
"""Create garage PyTorch TRPO model and training.
Args:
ctxt (garage.experiment.ExperimentContext): The experiment
configuration used by Trainer to create the
snapshotter.
env_id (str): Environment id of the task.
seed (int): Random positive integer for the trial.
"""
deterministic.set_seed(seed)
trainer = Trainer(ctxt)
env = normalize(GymEnv(env_id))
policy = PyTorch_GMP(env.spec,
hidden_sizes=hyper_parameters['hidden_sizes'],
hidden_nonlinearity=torch.tanh,
output_nonlinearity=None)
value_function = GaussianMLPValueFunction(env_spec=env.spec,
hidden_sizes=(32, 32),
hidden_nonlinearity=torch.tanh,
output_nonlinearity=None)
sampler = LocalSampler(agents=policy,
envs=env,
max_episode_length=env.spec.max_episode_length)
algo = PyTorch_TRPO(env_spec=env.spec,
policy=policy,
value_function=value_function,
sampler=sampler,
discount=hyper_parameters['discount'],
gae_lambda=hyper_parameters['gae_lambda'])
trainer.setup(algo, env)
trainer.train(n_epochs=hyper_parameters['n_epochs'],
batch_size=hyper_parameters['batch_size'])
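Because wrap_experiment supplies the experiment context, the decorated launcher is called like an ordinary function, with the remaining arguments passed as keywords while ctxt is filled in automatically. A minimal invocation sketch (the environment id and seed below are placeholder choices, not values prescribed by garage):
if __name__ == '__main__':
    # ctxt is injected by wrap_experiment; pass the other arguments
    # as keywords. The environment id and seed are placeholders.
    trpo_garage_pytorch(env_id='InvertedDoublePendulum-v2', seed=1)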
This page will give you more insight into running experiments.
Plotting results¶
In garage, we use TensorBoard for plotting experiment results.
This guide provides details on how to set up TensorBoard when running experiments in garage.
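For instance, once an experiment has written its TensorBoard event files, they can be viewed by pointing TensorBoard at the output directory (the path below assumes the default data/local/experiment location described in the next section):
tensorboard --logdir data/local/experiment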
Experiment outputs¶
The Trainer (called LocalRunner in older releases) is the state manager for experiments in garage. It creates, saves, and restores experiment state, known as a snapshot, during an experiment. A snapshot includes the hyperparameter configuration, training progress, pickled algorithm and environment objects, the TensorBoard event file, and so on.
By default, experiment results are written to data/local/experiment, relative to the directory containing the garage package. The output directory is generally organized as follows:
└── data
└── local
└── experiment
└── your_experiment_name
├── progress.csv
├── debug.log
├── variant.json
├── metadata.json
├── launch_archive.tar.xz
└── events.out.tfevents.xxx
wrap_experiment can be invoked with arguments to modify the default output directory, change the snapshot mode, control the snapshot gap, and so on. For example, to modify the default output directory and change the snapshot mode from last (only the last iteration is saved) to all (every iteration is saved), we can do this:
@wrap_experiment(log_dir='./your_log_dir', snapshot_mode='all')
def my_experiment(ctxt, seed, lr=0.5):
    ...
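Snapshots are also what make resuming an experiment possible. The sketch below shows one way to restore and continue a previous run with the Trainer; the directory name is a placeholder for the output directory created by an earlier experiment:
from garage import wrap_experiment
from garage.trainer import Trainer


@wrap_experiment
def resume_my_experiment(ctxt):
    # Restore algorithm, environment, and training progress from the
    # snapshot directory of an earlier run (placeholder path below).
    trainer = Trainer(ctxt)
    trainer.restore('data/local/experiment/your_experiment_name')
    trainer.resume()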
During an experiment, garage makes extensive use of the logger from dowel to log output to StdOutput, TextOutput, and/or CsvOutput. For details, you can check this.
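garage's logging is built on dowel's global logger and tabular objects, so the same primitives can be used in your own launchers. A minimal sketch of dowel usage (the metric names and values are placeholders, not ones garage records for you):
import dowel
from dowel import logger, tabular

# Send log output to stdout; CsvOutput, TextOutput, and
# TensorBoardOutput can be added the same way.
logger.add_output(dowel.StdOutput())

logger.log('Starting a toy logging loop')
for epoch in range(3):
    tabular.record('Epoch', epoch)
    tabular.record('AverageReturn', 0.0)  # placeholder metric
    logger.log(tabular)
    logger.dump_all()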
Open Source Support¶
garage has been active in the open-source community since October 2018, contributing to RL research and development. Contributions from the community are more than welcome.
Resources¶
If you are interested in a more in-depth look at specific capabilities of garage, you can find many other guides on this website.
This page was authored by Iris Liu (@irisliucy).