This is one of my Udacity Deep Reinforcement Learning Nanodegree projects. Most of the code is based on what I learned from the course.
For this project, we will train an agent to navigate and collect bananas!
A reward of +1 is given for collecting a yellow banana, and a reward of -1 for collecting a blue banana. Our goal is to collect as many yellow bananas as possible while avoiding blue bananas.
The state space consists of 37 dimensions, including the agent's velocity, directions, and so on. The agent has four discrete actions:

- `0` - move forward
- `1` - move backward
- `2` - turn left
- `3` - turn right
The task is episodic. The goal is to train the agent to get an average score of +13 over 100 consecutive episodes.
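The solving criterion above can be sketched as follows (a minimal illustration with hypothetical placeholder scores, not output of the actual agent; `is_solved` is an illustrative helper, not a function from this repository):

```python
from collections import deque

def is_solved(scores_window, target=13.0):
    """The task counts as solved when the average score over the
    last 100 consecutive episodes reaches the target of +13."""
    return len(scores_window) == 100 and sum(scores_window) / 100 >= target

scores_window = deque(maxlen=100)   # keeps only the 100 most recent scores
for episode in range(1, 2001):
    score = 14.0                    # placeholder for one episode's actual return
    scores_window.append(score)
    if is_solved(scores_window):
        break
print(episode)  # with these placeholder scores, solved at episode 100
```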
- Create (and activate) a new environment with Python 3.6.

  - Linux or Mac:

    ```
    conda create --name drlnd python=3.6
    source activate drlnd
    ```

  - Windows:

    ```
    conda create --name drlnd python=3.6
    activate drlnd
    ```
- Clone this repository and navigate to the `python/` folder. Then, install several dependencies.

  ```
  git clone https://github.com/nithiroj/DQN-Navigation.git
  cd DQN-Navigation/python
  pip install .
  ```
- Download the environment that matches your operating system, place it in the `DQN-Navigation/` folder, and unzip it.

  - Linux: click here
  - Mac OSX: click here (`Banana.app` already in this repository)
  - Windows (32-bit): click here
  - Windows (64-bit): click here
In this repository, there are four trained algorithms: basic or vanilla DQN (`basic.pth`), double DQN (`double.pth`), dueling DQN (`dueling.pth`), and combined double and dueling DQN (`double_dueling.pth`). The trained models and results (graph and scores) have been provided in `model/` and `report/` respectively.
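As a rough illustration of how the double-DQN extension changes the learning target relative to basic DQN (a pure-Python sketch of the target computation, not this project's actual training code; the function names and toy Q-values are hypothetical):

```python
def dqn_target(q_target_next, reward, gamma=0.99, done=False):
    """Basic DQN target: bootstrap from the max Q-value of the target network."""
    return reward + gamma * max(q_target_next) * (0.0 if done else 1.0)

def double_dqn_target(q_online_next, q_target_next, reward, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    while the target network evaluates it, reducing overestimation bias."""
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best] * (0.0 if done else 1.0)

# Toy Q-values: the two targets differ when the networks disagree
q_online_next = [1.0, 3.0, 2.0]   # hypothetical online-network Q-values
q_target_next = [2.5, 0.5, 1.5]   # hypothetical target-network Q-values
print(dqn_target(q_target_next, reward=1.0))                        # ~3.475
print(double_dqn_target(q_online_next, q_target_next, reward=1.0))  # ~1.495
```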
You can retrain these models or train them with your own hyperparameters. Please note that the provided models and results will be overwritten accordingly, so make copies first if you want to keep ours.
Follow the instructions in `Navigation.ipynb` to get started with training your own agent!
Alternatively, run from the command line in your terminal. Define `--env_file` if your environment file is not `Banana.app`.
```
python navigation.py                     # to train with basic DQN model
python navigation.py --double            # to train with double DQN model
python navigation.py --dueling           # to train with dueling DQN model
python navigation.py --double --dueling  # both double and dueling DQN model
```
```
optional arguments:
  -h, --help           show this help message and exit
  --play_eps PLAY_EPS  Train if 0 else play episodes, default 0
  --env_file ENV_FILE  Unity environment binary file, default Banana.app
  --dueling            Enable dueling DQN
  --double             Enable double DQN
```
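A minimal `argparse` setup that would produce the help text above might look like this (a sketch under the assumption that the repository's `navigation.py` uses the standard-library `argparse` module; the actual script may differ):

```python
import argparse

def build_parser():
    """Build a command-line interface matching the options listed above."""
    parser = argparse.ArgumentParser(description="Train or play a DQN agent.")
    parser.add_argument("--play_eps", type=int, default=0,
                        help="Train if 0 else play episodes, default 0")
    parser.add_argument("--env_file", type=str, default="Banana.app",
                        help="Unity environment binary file, default Banana.app")
    parser.add_argument("--dueling", action="store_true", help="Enable dueling DQN")
    parser.add_argument("--double", action="store_true", help="Enable double DQN")
    return parser

args = build_parser().parse_args(["--double", "--play_eps", "2"])
print(args.double, args.dueling, args.play_eps)  # True False 2
```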
To watch how the agent performs, define the number of playing episodes with `--play_eps` and the agent you prefer. For example, to watch a double-DQN agent play for two episodes:

```
python navigation.py --double --play_eps 2
```
You can find more details on the implementation (algorithms, including DQN and its extensions; model architectures; and chosen hyperparameters) and the achieved rewards in `Report.ipynb`.