trpo_swimmer_ray_sampler
This example trains a policy with the TRPO algorithm.
It uses the Ray sampler instead of the on_policy vectorized sampler, and runs the Swimmer-v2 environment for 40 iterations.
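
To illustrate what a parallel sampler contributes to the training loop, here is a minimal standard-library sketch. It does not use the Ray or garage APIs: the toy random-walk task, the thread pool, and the names `rollout`, `collect_batch`, and `train` are all hypothetical stand-ins, and the TRPO policy update is omitted. It only shows the overall shape, assuming one batch of rollouts is gathered by several workers per iteration for 40 iterations.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def rollout(seed, max_path_length):
    """Run one episode of a toy 1-D random-walk task and return its rewards."""
    rng = random.Random(seed)
    position, rewards = 0.0, []
    for _ in range(max_path_length):
        action = rng.uniform(-1.0, 1.0)  # stand-in for sampling from a policy
        position += action
        rewards.append(position)  # hypothetical reward: displacement so far
    return rewards


def collect_batch(n_workers, n_paths, max_path_length, base_seed):
    """Collect n_paths episodes, spreading the rollouts across worker threads."""
    seeds = [base_seed + i for i in range(n_paths)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves input order, so results are reproducible.
        return list(pool.map(lambda s: rollout(s, max_path_length), seeds))


def train(n_epochs=40, n_paths=8, max_path_length=50, n_workers=4, seed=1):
    """Outer loop: one batch of parallel rollouts per epoch (no policy update)."""
    mean_returns = []
    for epoch in range(n_epochs):
        paths = collect_batch(n_workers, n_paths, max_path_length,
                              seed + epoch * n_paths)
        mean_returns.append(sum(sum(p) for p in paths) / len(paths))
        # A real TRPO step would update the policy from `paths` here.
    return mean_returns


if __name__ == "__main__":
    returns = train()
    print(f"epochs run: {len(returns)}")
```

In the actual example, the rollouts come from the Swimmer-v2 environment and the workers are managed by Ray rather than by a local thread pool.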