Pong

Trains a Pong agent using policy gradients on OpenAI's gym. This code was copied from Andrej Karpathy's Deep Reinforcement Learning: Pong from Pixels, and almost all changes to it are cosmetic. Please refer to Karpathy's walkthrough to learn more about the implementation!

Usage

python3 pong.py

Set resume = True in pong.py to continue training the agent from the state saved in model.p. Set resume = False to start training from scratch.
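For readers wondering what the resume flag might do under the hood, here is a minimal sketch of a load-or-initialize pattern using pickle. It is not taken from pong.py: the function names `load_or_init_model` and `save_model`, the `hidden`, `inputs`, and `seed` parameters, and the `W1`/`W2` keys with their default sizes are assumptions made for illustration, so check pong.py for the actual details.

```python
import pickle

import numpy as np


def load_or_init_model(resume, path="model.p", hidden=200, inputs=80 * 80, seed=None):
    """Return saved weights if resume is True, otherwise fresh random weights.

    Hypothetical helper: the layer sizes and the W1/W2 keys are assumptions.
    """
    if resume:
        # Raises FileNotFoundError if no checkpoint has been saved yet.
        with open(path, "rb") as f:
            return pickle.load(f)
    rng = np.random.default_rng(seed)
    return {
        # Scale by 1/sqrt(fan_in) so initial activations stay in a sane range.
        "W1": rng.standard_normal((hidden, inputs)) / np.sqrt(inputs),
        "W2": rng.standard_normal(hidden) / np.sqrt(hidden),
    }


def save_model(model, path="model.p"):
    """Write the weights to disk so a later run can resume from them."""
    with open(path, "wb") as f:
        pickle.dump(model, f)
```

With a helper like this, a training loop would call `save_model(model)` periodically, and a later run started with resume = True would pick up the same weights. Note that pickle should only be used to load files you created yourself.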

Output

Resuming model 'model.p'...
ep 0: game finished, reward: 1.000000
ep 0: game finished, reward: 1.000000
ep 0: game finished, reward: 1.000000
ep 0: game finished, reward: -1.000000
ep 0: game finished, reward: 1.000000
ep 0: game finished, reward: -1.000000
...
...
...
