cma_es_cartpole
¶
This is an example to train a task with CMA-ES.
Here it runs CartPole-v1 environment with 100 epoches.
- Results:
AverageReturn: 100 RiseTime: epoch 38 (itr 760),
but regression is observed in the course of training.