Behavioral Cloning (BC)

Paper

Model-Free Imitation Learning with Policy Optimization [1]

Framework(s)

PyTorch

API Reference

garage.torch.algos.BC

Code

garage/torch/algos/bc.py

Examples

bc_point, bc_point_deterministic_policy

Behavioral cloning is a simple imitation learning algorithm that maximizes the likelihood of the expert’s demonstrated actions under the apprentice policy using direct policy optimization. Garage’s implementation may use either a policy or a dataset as the expert.
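The likelihood objective described above can be illustrated with a minimal PyTorch sketch. It assumes a diagonal Gaussian apprentice policy, and the function name `bc_log_prob_loss` and the toy tensors are hypothetical; it is not taken from garage/torch/algos/bc.py.

```python
import torch
from torch.distributions import Normal


def bc_log_prob_loss(mean, log_std, expert_actions):
    """Negative mean log-likelihood of expert actions under a diagonal Gaussian.

    Hypothetical helper for illustration; not part of garage.
    """
    dist = Normal(mean, log_std.exp())
    # Sum log-probabilities over action dimensions, then average over the batch.
    log_prob = dist.log_prob(expert_actions).sum(dim=-1)
    return -log_prob.mean()


torch.manual_seed(0)
expert_actions = torch.randn(8, 2)  # toy expert actions: batch of 8, 2-D actions
log_std = torch.zeros(2)            # unit standard deviation per action dimension

# A policy mean that matches the expert yields a lower loss than a shifted one.
loss_match = bc_log_prob_loss(expert_actions.clone(), log_std, expert_actions)
loss_off = bc_log_prob_loss(expert_actions + 1.0, log_std, expert_actions)
```

Minimizing this loss with respect to the policy parameters is equivalent to maximizing the likelihood of the expert's actions, so `loss_match` is smaller than `loss_off`.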

Default Parameters

policy_optimizer = torch.optim.Adam
policy_lr = 1e-3
loss = 'log_prob'
batch_size = 1000
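To show how the defaults listed above fit together, here is a rough sketch of a training loop in plain PyTorch. It uses Adam with a learning rate of 1e-3, batches of 1000 samples, and a log-probability loss. The expert dataset, the network, and every variable name are assumptions made for illustration; this is not how garage.torch.algos.BC is implemented or called.

```python
import torch
from torch import nn

# Values taken from the defaults listed above.
POLICY_LR = 1e-3
BATCH_SIZE = 1000

torch.manual_seed(0)

# Hypothetical expert dataset: 2-D observations, expert action = -observation.
expert_obs = torch.randn(5000, 2)
expert_act = -expert_obs

# Hypothetical learner: state-dependent mean, learned state-independent log-std.
mean_net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))
log_std = nn.Parameter(torch.zeros(2))
optimizer = torch.optim.Adam(
    list(mean_net.parameters()) + [log_std], lr=POLICY_LR)


def log_prob_loss(obs, act):
    """Negative mean log-likelihood of expert actions under the learner."""
    dist = torch.distributions.Normal(mean_net(obs), log_std.exp())
    return -dist.log_prob(act).sum(dim=-1).mean()


losses = []
for step in range(200):
    # Draw a batch of expert (observation, action) pairs with replacement.
    idx = torch.randint(0, expert_obs.shape[0], (BATCH_SIZE,))
    loss = log_prob_loss(expert_obs[idx], expert_act[idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

In this toy setup the recorded loss should fall over the 200 updates as the learner's mean moves toward the expert's actions.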

Examples

bc_point

bc_point_deterministic_policy

Experiment Results

Figure: BC Mean Loss (../_images/bc_meanLoss.png)
Figure: BC Mean Loss (../_images/bc_stdLoss.png)

References

[1] Jonathan Ho, Jayesh Gupta, and Stefano Ermon. Model-free imitation learning with policy optimization. In International Conference on Machine Learning, 2760–2769. 2016. URL: https://arxiv.org/abs/1605.08478.


This page was authored by Iris Liu (@irisliucy) with contributions from Ryan Julian (@ryanjulian).