Introduction to Machine Learning Project 6

Reinforcement learning: Value Iteration, SARSA, and Q-learning on the racetrack problem
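
For orientation, SARSA and Q-learning differ only in the target they bootstrap from. The sketch below shows the two standard tabular updates. It is not taken from exp_Q.py or exp_SARSA.py: the function names, the integer state and action indices, the list-of-lists Q-table, and the default alpha and gamma values are all assumptions made for illustration.

```python
from typing import List

QTable = List[List[float]]  # hypothetical layout: Q[state][action]


def q_learning_update(Q: QTable, s: int, a: int, r: float, s_next: int,
                      alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q: QTable, s: int, a: int, r: float, s_next: int, a_next: int,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
```

When the next action happens to be the greedy one, the two updates coincide; they diverge only on exploratory steps.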

Train a policy

To train a Q-learning policy:

python exp_Q.py

To train a SARSA policy:

python exp_SARSA.py

To train a value iteration policy:

python exp_VI.py

This results in an arrays directory which contains the policy at various points during training as well as a

Race with a policy

Once a policy is trained, it can be used in a race.

python race.py

This produces a race like so:

Results

Learning curves for Q-learning on different tracks and with different crash behavior. With a "normal crash", a car that hits a wall returns to the last valid track square; with a "bad crash", it returns to the starting line.
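
The two crash behaviors can be sketched as a single function. This is a minimal illustration, not the project's code: the name resolve_crash, its parameters, and the (row, column) position tuples are hypothetical, and resetting the velocity to zero after a crash is an assumption that the text above does not state.

```python
from typing import Tuple

Position = Tuple[int, int]
Velocity = Tuple[int, int]


def resolve_crash(last_valid: Position, start: Position,
                  bad_crash: bool) -> Tuple[Position, Velocity]:
    """Return the car's (position, velocity) after it hits a wall."""
    # "Bad crash": back to the starting line.
    # "Normal crash": back to the last valid track square.
    new_position = start if bad_crash else last_valid
    # Assumption: the car comes to a stop after any crash.
    return new_position, (0, 0)
```

For example, resolve_crash((3, 4), (0, 1), bad_crash=True) places the car at (0, 1), while the same call with bad_crash=False leaves it at (3, 4).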

Example race:
