Introduction to Machine Learning Project 6

Reinforcement learning: Value Iteration, SARSA, and Q-learning on the racetrack problem
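As a rough illustration of how the two temporal-difference methods named above differ, here is a minimal sketch of their tabular update rules. It is not taken from this repository's source: the function names, the list-of-lists Q-table layout, and the default `alpha` and `gamma` values are all assumptions made for the example.

```python
def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action value in s_next."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


# Hypothetical two-state, two-action table: q[state][action]
q = [[0.0, 0.0], [2.0, 4.0]]
q_learning_update(q, s=0, a=0, r=-1, s_next=1, alpha=0.5, gamma=1.0)
sarsa_update(q, s=0, a=1, r=-1, s_next=1, a_next=0, alpha=0.5, gamma=1.0)
```

The only difference is the bootstrap term: Q-learning uses the maximum value over next actions, while SARSA uses the value of the next action the behavior policy actually selected.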

Train a policy

To train a Q-learning policy:

python exp_Q.py

To train a SARSA policy:

python exp_SARSA.py

To train a value iteration policy:

python exp_VI.py

This results in an arrays directory which contains the policy at various points during training as well as a

Race with a policy

Once a policy is trained, it can be used in a race.

python race.py

This produces a race like so:

Results

Plot of learning curves for Q-Learning on different tracks and with different crash behavior. A "normal crash" means that if the car crashes into a wall, it returns to the last valid track square, whereas a "bad crash" means that it returns to the starting line upon crashing.
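The two crash behaviors can be sketched as follows. This is an illustrative example only, not the repository's implementation: the `move` function, the string-based track encoding (`#` for wall), the (row, column) coordinates, the simple path sampling, and resetting velocity to zero after a crash are all assumptions.

```python
def move(track, pos, vel, start, bad_crash):
    """Advance the car along its velocity vector, handling wall crashes.

    Returns (new_position, new_velocity). Positions and velocities are
    (row, col) tuples; track is a list of equal-length strings.
    """
    row, col = pos
    d_row, d_col = vel
    steps = max(abs(d_row), abs(d_col))
    last_valid = pos
    for i in range(1, steps + 1):
        # Sample intermediate squares along the straight-line path.
        r = row + round(d_row * i / steps)
        c = col + round(d_col * i / steps)
        off_grid = not (0 <= r < len(track) and 0 <= c < len(track[0]))
        if off_grid or track[r][c] == "#":
            # Bad crash: back to the starting line.
            # Normal crash: back to the last valid track square.
            new_pos = start if bad_crash else last_valid
            return new_pos, (0, 0)  # assumption: velocity resets on a crash
        last_valid = (r, c)
    return last_valid, vel


track = ["S..#"]  # hypothetical one-row track with a wall at the end
normal = move(track, (0, 0), (0, 3), start=(0, 0), bad_crash=False)
bad = move(track, (0, 0), (0, 3), start=(0, 0), bad_crash=True)
```

Under these assumptions, the same crash leaves the car next to the wall in the normal case and sends it all the way back to the start in the bad case, which is why the bad-crash setting is the harder learning problem.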

Example race:

Repository: skycarl/ml_proj_6
