Cross Entropy Method (CEM)

Paper

The cross-entropy method: A unified approach to Monte Carlo simulation, randomized optimization and machine learning [1]

Framework(s)

NumPy

API Reference

garage.np.algos.CEM

Code

garage/np/algos/cem.py

The Cross Entropy Method (CEM) iteratively optimizes a Gaussian distribution over policy parameters.

In each epoch, CEM does the following:

  1. Sample n_samples policies from a Gaussian distribution with mean cur_mean and standard deviation cur_std.

  2. Collect episodes for each policy.

  3. Update cur_mean and cur_std by maximum likelihood estimation over the n_best policies with the highest returns.
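
The three steps above can be sketched in plain NumPy. This is a minimal illustration, not the code in garage/np/algos/cem.py: the function cem_sketch, the evaluate callback, and the toy target are hypothetical names introduced here, and episode collection is replaced by a simple scoring function. It assumes a diagonal Gaussian, for which the maximum likelihood fit is the sample mean and the (biased) sample standard deviation of the selected parameter vectors.

```python
import numpy as np


def cem_sketch(evaluate, init_mean, init_std, n_samples, n_best,
               n_epochs, seed=0):
    """Toy CEM loop (hypothetical helper, not garage's implementation).

    `evaluate` maps a parameter vector to a scalar return.
    """
    rng = np.random.default_rng(seed)
    cur_mean = np.asarray(init_mean, dtype=float)
    cur_std = np.full_like(cur_mean, init_std)

    for _ in range(n_epochs):
        # 1. Sample n_samples parameter vectors from N(cur_mean, cur_std^2).
        noise = rng.standard_normal((n_samples, cur_mean.size))
        samples = cur_mean + cur_std * noise
        # 2. Stand-in for collecting episodes: score each sample directly.
        returns = np.array([evaluate(s) for s in samples])
        # 3. Keep the n_best samples by return and refit the Gaussian.
        #    For a diagonal Gaussian the MLE is the sample mean and the
        #    biased (ddof=0) sample standard deviation.
        best = samples[np.argsort(returns)[-n_best:]]
        cur_mean = best.mean(axis=0)
        cur_std = best.std(axis=0)
    return cur_mean, cur_std


# Assumed toy objective: the return is highest when params == target.
target = np.array([1.0, -2.0, 0.5])
mean, std = cem_sketch(lambda p: -np.sum((p - target) ** 2),
                       init_mean=np.zeros(3), init_std=2.0,
                       n_samples=100, n_best=10, n_epochs=30)
```

In this sketch cur_std shrinks as the selected samples cluster, so the search narrows around the best parameters found so far. If the initial standard deviation is too small relative to the distance to good parameters, the distribution can collapse before reaching them.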

Examples

NumPy

References

1

Reuven Y. Rubinstein and Dirk P. Kroese. The cross-entropy method: a unified approach to Monte Carlo simulation, randomized optimization and machine learning. Information Science & Statistics, Springer Verlag, NY, 2004.


This page was authored by Ruofu Wang (@yeukfu).