Deep Q Networks (DQN)

Paper

Playing Atari with Deep Reinforcement Learning. [1]

Framework(s)


PyTorch


TensorFlow

API Reference

garage.torch.algos.DQN

garage.tf.algos.DQN

Code

garage/torch/algos/dqn.py

garage/tf/algos/dqn.py

Examples

examples

Deep Q Networks, or simply DQN, is a staple off-policy method on top of which many more recent algorithms were developed. It uses a learned Q function to estimate the value of each state-action pair (S, A), and acts by taking the argmax over the Q values of the actions available in the current state. The algorithm is best known for learning policies that perform well on a large suite of Atari games.
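To make the argmax step concrete, here is a minimal sketch of greedy action selection and the standard one-step Q-learning target. It uses plain Python lists to stay dependency-free. The function names `greedy_action` and `td_target` and the example numbers are illustrative assumptions and are not part of garage's API.

```python
from typing import Sequence


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the action with the largest Q value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward: float, next_q_values: Sequence[float],
              discount: float, terminal: bool) -> float:
    """One-step Q-learning target: r + discount * max_a' Q(s', a')."""
    if terminal:
        # No bootstrapping from the next state once the episode has ended.
        return reward
    return reward + discount * max(next_q_values)


# Hypothetical Q(s, a) estimates for a state with three actions.
q_values = [0.1, 0.7, 0.3]
action = greedy_action(q_values)

# Hypothetical transition: reward 1.0, Q estimates for the next state.
target = td_target(reward=1.0, next_q_values=[0.2, 0.5, 0.4],
                   discount=0.99, terminal=False)
```

In practice the Q values come from a neural network evaluated on a batch of observations, and exploration (for example epsilon-greedy) is layered on top of the greedy choice.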

Supported DQN Variants

As of now, both the TensorFlow and PyTorch implementations support Double DQN and Dueling DQN, as well as the two combined. Double DQN can be enabled by passing a flag to the algorithm:

algo = DQN(policy=policy,
           double_q=True,
           ...)
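For intuition about what the flag changes, the sketch below shows the Double DQN target as it is usually described: the online network picks the next action and the target network evaluates it. This is a dependency-free illustration under that assumption, not garage's internal code, and `double_q_target` is a hypothetical name.

```python
from typing import Sequence


def double_q_target(reward: float,
                    next_q_online: Sequence[float],
                    next_q_target: Sequence[float],
                    discount: float,
                    terminal: bool) -> float:
    """Double DQN target: select with the online net, evaluate with the target net."""
    if terminal:
        return reward
    # Action selection uses the online network's estimates...
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while the value of that action comes from the target network.
    return reward + discount * next_q_target[best]


# Hypothetical estimates for the next state from the two networks.
y = double_q_target(reward=0.0,
                    next_q_online=[1.0, 2.0],
                    next_q_target=[5.0, 3.0],
                    discount=0.9,
                    terminal=False)
```

Here the online network prefers action 1, so the target bootstraps from the target network's value for action 1 (3.0) rather than from its maximum (5.0). Decoupling selection from evaluation in this way is what reduces overestimation.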

Dueling DQN can be used by passing a Q function with the dueling architecture to the algorithm. When using the TensorFlow branch, a dueling Q function can be constructed as follows:

qf = DiscreteCNNQFunction(env_spec=env.spec,
                          dueling=True,
                          ...)

See the Pong example below for a full launcher file.

In the PyTorch branch, the dueling architecture has its own class:

qf = DiscreteDuelingCNNQFunction(env_spec=env.spec,
                                 ...)

See the Atari launcher below for a full example.
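As background on what a dueling Q function computes, the sketch below shows the commonly used aggregation of a state-value stream and an advantage stream, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)). It assumes this mean-subtracted form; whether garage's classes use exactly this variant is not stated on this page, and `dueling_q_values` is a hypothetical helper.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float,
                     advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using mean-subtracted advantages."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


# Hypothetical outputs of the two streams for a state with three actions.
q_values = dueling_q_values(state_value=2.0, advantages=[1.0, 0.0, -1.0])
```

Subtracting the mean advantage keeps the decomposition identifiable: adding a constant to every advantage leaves the resulting Q values unchanged.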

Examples

Both the TensorFlow and PyTorch branches of garage contain several DQN examples.

PyTorch

The PyTorch branch provides an Atari launcher (see below) that lets you run experiments from the command line:

# train on PongNoFrameskip-v4 for 10M steps
python examples/torch/dqn_atari.py Pong --seed 22 --n_steps 10e6

By default, all the hyperparameters in the hyperparams dict at the top of the launcher file, as well as those passed in via the command line, are saved to the variant.json file in the experiment’s directory. You can use this file to quickly compare hyperparameters across runs.
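One way to do such a comparison is to load two variant.json files and report the keys whose values differ. The sketch below assumes each file parses to a JSON object whose top-level keys are the settings of interest. The exact layout of variant.json is not described here, and the helper names and paths are hypothetical.

```python
import json
from typing import Any, Dict, Tuple


def load_variant(path: str) -> Dict[str, Any]:
    """Read a variant.json file into a dict."""
    with open(path) as f:
        return json.load(f)


def diff_variants(a: Dict[str, Any],
                  b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for top-level keys that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Example with in-memory dicts standing in for two runs (made-up values).
run_a = {"seed": 22, "n_steps": 10000000.0, "lr": 0.0001}
run_b = {"seed": 22, "n_steps": 10000000.0, "lr": 0.0003}
changed = diff_variants(run_a, run_b)

# With real runs, the paths below are placeholders for your experiment dirs:
# changed = diff_variants(load_variant("path/to/run_a/variant.json"),
#                         load_variant("path/to/run_b/variant.json"))
```

Nested settings are compared as whole values, so a difference inside a nested dict shows up under its top-level key.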

The training curves for Dueling DQN on the Atari-10M suite are included below.

[Figure: Dueling DQN on Atari 10M]

TF

References

[1] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari with Deep Reinforcement Learning. 2013. arXiv:1312.5602.


This page was authored by Mishari Aliesa (@maliesa96).