ppo_pendulum
This example trains a policy on a task with the PPO (Proximal Policy Optimization) algorithm.
It creates the InvertedDoublePendulum environment using gym and trains with PPO for 1M steps.
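For readers new to PPO, the clipped surrogate objective that the algorithm maximizes can be sketched in a few lines. This is a generic illustration in plain Python, not the code this example runs: the function name `clipped_surrogate` is hypothetical, and the clip range of 0.2 is an assumed, commonly used default rather than a value stated here.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_range=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_range=0.2 is an assumption.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_range, 1 + clip_range].
        clipped = max(1.0 - clip_range, min(1.0 + clip_range, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The `min` means a sample stops contributing extra objective once the policy has moved more than `clip_range` in the direction the advantage favors, which is what keeps each update conservative.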
- Results:
  AverageDiscountedReturn: 500
  RiseTime: itr 40
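As a rough guide to reading the AverageDiscountedReturn figure, the sketch below computes a discounted return per episode and averages it over a batch. It assumes the metric is the mean of per-episode discounted returns; the helper names and the discount of 0.99 are hypothetical and are not taken from this example.

```python
def discounted_return(rewards, discount=0.99):
    """Sum of rewards weighted by discount**t (discount=0.99 is assumed)."""
    ret = 0.0
    # Accumulate from the last step backwards: G_t = r_t + discount * G_{t+1}.
    for reward in reversed(rewards):
        ret = reward + discount * ret
    return ret


def average_discounted_return(episodes, discount=0.99):
    """Mean discounted return over a batch of episodes (lists of rewards)."""
    returns = [discounted_return(ep, discount) for ep in episodes]
    return sum(returns) / len(returns)
```

For instance, `average_discounted_return([[1, 1, 1], [2]], discount=0.5)` averages the two episode returns 1.75 and 2.0.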