Episodic Reward Weighted Regression (ERWR)¶
Papers | Using Reward-weighted Regression for Reinforcement Learning of Task Space Control [1]; Policy Search for Motor Primitives in Robotics [2]
Framework(s) |
API Reference |
Code |
Examples |
Episodic Reward Weighted Regression (ERWR) is an extension of the original RWR algorithm, which uses a linear policy to solve the immediate-reward learning problem. The extension implemented here applies RWR to episodic reinforcement learning. To read more about both algorithms, see the cited papers or the short summary below.
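In rough terms (a paraphrase of the cited papers, using notation that is not taken verbatim from them), RWR casts the policy update as a reward-weighted maximum-likelihood (regression) problem; the episodic variant weights whole trajectories by their returns:

\[
\theta_{k+1} = \arg\max_{\theta}\ \mathbb{E}_{\tau \sim \pi_{\theta_k}}\left[ R(\tau) \sum_{t=0}^{T-1} \log \pi_{\theta}(a_t \mid s_t) \right]
\]

where \(R(\tau)\) is the (possibly shifted and rescaled) return of trajectory \(\tau\). Because the returns act as regression weights, they are typically transformed to be non-negative; this is reflected in the positive_adv=True default listed below.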
Default Parameters¶
scope=None
discount=0.99
gae_lambda=1
center_adv=True
positive_adv=True
fixed_horizon=False
lr_clip_range=0.01
max_kl_step=0.01
optimizer=None
optimizer_args=None
policy_ent_coeff=0.0
use_softplus_entropy=False
use_neg_logli_entropy=False
stop_entropy_gradient=False
entropy_method='no_entropy'
name='ERWR'
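The parameters above are the keyword-argument defaults of the ERWR constructor. Below is a minimal usage sketch, loosely based on the cartpole example shipped with garage; it assumes the garage TensorFlow stack (garage.tf.algos.ERWR, TFTrainer, CategoricalMLPPolicy, LinearFeatureBaseline, GymEnv). Exact import paths and constructor signatures (for example, whether a sampler is passed to the algorithm or to the trainer) vary between garage releases, so treat this as a sketch rather than a drop-in script.

```python
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment.deterministic import set_seed
from garage.np.baselines import LinearFeatureBaseline
from garage.tf.algos import ERWR
from garage.tf.policies import CategoricalMLPPolicy
from garage.trainer import TFTrainer


@wrap_experiment
def erwr_cartpole(ctxt=None, seed=1):
    """Train ERWR on CartPole-v1 (sketch; API details may differ by garage version)."""
    set_seed(seed)
    with TFTrainer(snapshot_config=ctxt) as trainer:
        env = GymEnv('CartPole-v1')

        # A simple categorical policy for the discrete CartPole action space.
        policy = CategoricalMLPPolicy(name='policy',
                                      env_spec=env.spec,
                                      hidden_sizes=(32, 32))

        baseline = LinearFeatureBaseline(env_spec=env.spec)

        # Any of the defaults listed above can be overridden here.
        algo = ERWR(env_spec=env.spec,
                    policy=policy,
                    baseline=baseline,
                    discount=0.99)

        trainer.setup(algo=algo, env=env)
        trainer.train(n_epochs=100, batch_size=10000)


erwr_cartpole(seed=1)
```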
References¶
[1] J. Peters and S. Schaal. Using reward-weighted regression for reinforcement learning of task space control. In 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages 262–267, 2007.
[2] J. Kober and J. Peters. Policy search for motor primitives in robotics. In Advances in Neural Information Processing Systems 21 (NIPS 2008), pages 849–856, 2009.
This page was authored by Mishari Aliesa (@maliesa96).