Episodic Reward Weighted Regression (ERWR)

Papers

Using Reward-weighted Regression for Reinforcement Learning of Task Space Control [1]

Policy Search for Motor Primitives in Robotics [2]

Framework(s)

TensorFlow

API Reference

garage.tf.algos.ERWR

Code

garage/tf/algos/erwr.py

Examples

erwr_cartpole

Episodic Reward Weighted Regression (ERWR) extends the original reward-weighted regression (RWR) algorithm, which uses a linear policy to solve the immediate-reward reinforcement learning problem [1]. The extension implemented here applies RWR to episodic reinforcement learning [2]. For more detail on both algorithms, see the cited papers or the brief summary below.
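
In brief, RWR treats policy improvement as an expectation-maximization problem: each iteration refits the policy to the sampled actions by maximum likelihood, with every sample weighted by (a transformation of) its reward, so high-reward behavior becomes more likely. The following is a compact sketch of the two updates in our own notation, not the papers'; see [1] and [2] for the exact derivations and weighting schemes:

    % Immediate-reward RWR [1]: reward-weighted maximum likelihood.
    \theta_{k+1} = \arg\max_\theta \sum_i r_i \, \log \pi_\theta(a_i \mid s_i)

    % Episodic extension [2]: every action in an episode is weighted
    % by that episode's return R (e.g., the discounted return).
    \theta_{k+1} = \arg\max_\theta \sum_{\text{episodes}} \sum_t R \, \log \pi_\theta(a_t \mid s_t)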

Default Parameters

scope=None
discount=0.99
gae_lambda=1
center_adv=True
positive_adv=True
fixed_horizon=False
lr_clip_range=0.01
max_kl_step=0.01
optimizer=None
optimizer_args=None
policy_ent_coeff=0.0
use_softplus_entropy=False
use_neg_logli_entropy=False
stop_entropy_gradient=False
entropy_method='no_entropy'
name='ERWR'
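
The following is a minimal usage sketch modeled on the erwr_cartpole example. It assumes a recent garage release; import paths and constructor signatures (e.g., TFTrainer, LocalSampler) have changed between versions, so check them against your installed copy.

    from garage import wrap_experiment
    from garage.envs import GymEnv
    from garage.experiment.deterministic import set_seed
    from garage.np.baselines import LinearFeatureBaseline
    from garage.sampler import LocalSampler
    from garage.tf.algos import ERWR
    from garage.tf.policies import CategoricalMLPPolicy
    from garage.trainer import TFTrainer


    @wrap_experiment
    def erwr_cartpole(ctxt=None, seed=1):
        """Train ERWR on CartPole-v1 (a sketch; API may vary by version)."""
        set_seed(seed)
        with TFTrainer(snapshot_config=ctxt) as trainer:
            env = GymEnv('CartPole-v1')
            # A small categorical MLP policy for the discrete action space.
            policy = CategoricalMLPPolicy(name='policy',
                                          env_spec=env.spec,
                                          hidden_sizes=(32, 32))
            # Linear baseline used to compute advantages.
            baseline = LinearFeatureBaseline(env_spec=env.spec)
            sampler = LocalSampler(
                agents=policy,
                envs=env,
                max_episode_length=env.spec.max_episode_length,
                is_tf_worker=True)
            # The defaults listed above apply to any argument not passed here.
            algo = ERWR(env_spec=env.spec,
                        policy=policy,
                        baseline=baseline,
                        sampler=sampler,
                        discount=0.99)
            trainer.setup(algo=algo, env=env)
            trainer.train(n_epochs=100, batch_size=10000)


    erwr_cartpole(seed=1)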

References

[1] J. Peters and S. Schaal. Using reward-weighted regression for reinforcement learning of task space control. In 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages 262–267, 2007.

[2] J. Kober and J. Peters. Policy search for motor primitives in robotics. In Advances in Neural Information Processing Systems 21 (NIPS 2008), pages 849–856, 2009.


This page was authored by Mishari Aliesa (@maliesa96).