Naive implementation of Q-learning on CartPole-v1 in OpenAI Gym

AOZMH/Q-Learning-for-Cartpole-V1

Q-Learning-for-Cartpole-V1

Naive implementation of Q-learning on CartPole-v1 in OpenAI Gym

Execution

  • Training

Set the train_or_eval parameter at the bottom of main.py to 'train' and adjust the other hyperparameters as needed.

python main.py
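
As an illustration of what tabular Q-learning on CartPole-v1 involves, here is a minimal sketch, not the code in main.py. It assumes the continuous observation is discretized into bins and that Q-values are kept in a dictionary keyed by the bin indices. The names STATE_BOUNDS, NUM_BINS, discretize, choose_action and q_update, along with the bounds, bin counts, learning rate and discount factor, are all hypothetical and chosen only for illustration.

```python
import random

# Assumed ranges for (cart position, cart velocity, pole angle, pole angular velocity).
# The velocities are unbounded in CartPole-v1, so these clipping ranges are hypothetical.
STATE_BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.5, 3.5)]
NUM_BINS = (6, 6, 12, 12)  # hypothetical number of bins per dimension
N_ACTIONS = 2              # CartPole-v1 has two actions: push left, push right


def discretize(observation):
    """Map a continuous observation to a tuple of bin indices."""
    indices = []
    for value, (low, high), bins in zip(observation, STATE_BOUNDS, NUM_BINS):
        clipped = min(max(value, low), high)
        ratio = (clipped - low) / (high - low)
        # The upper bound maps to ratio 1.0, so clamp to the last bin.
        indices.append(min(int(ratio * bins), bins - 1))
    return tuple(indices)


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy selection: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q_table.get(state, [0.0] * N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: values[a])


def q_update(q_table, state, action, reward, next_state, done, lr=0.2, gamma=0.99):
    """Apply one Q-learning step: Q(s, a) += lr * (target - Q(s, a))."""
    values = q_table.setdefault(state, [0.0] * N_ACTIONS)
    next_values = q_table.get(next_state, [0.0] * N_ACTIONS)
    # No bootstrapping from the next state once the episode has ended.
    target = reward if done else reward + gamma * max(next_values)
    values[action] += lr * (target - values[action])
    return values[action]
```

In a training loop, each environment step would call discretize on the new observation, then q_update with the observed reward, then choose_action for the next move.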

  • Evaluating

This mode measures the rate at which the model survives at least 200 steps. Set train_or_eval in main.py to 'eval', set checkpoint_q_table to the path of the trained Q-table to evaluate, and set num_trials to the number of evaluation runs.

python main.py
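
The success rate described above can be sketched as follows. This is an assumption about the metric, not the evaluation code in main.py. ToyEnv is a hypothetical stand-in with a simplified reset/step interface (the real Gym step call returns more values), and success_rate, policy and target_steps are illustrative names.

```python
class ToyEnv:
    """Hypothetical stand-in environment; it is not CartPole-v1.

    reset() returns an observation and step(action) returns
    (observation, reward, done). Every episode ends after a fixed length.
    """

    def __init__(self, episode_length):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return 0, 1.0, self.t >= self.episode_length


def success_rate(env, policy, num_trials, target_steps=200):
    """Return the fraction of episodes that last at least target_steps steps."""
    successes = 0
    for _ in range(num_trials):
        state = env.reset()
        steps = 0
        done = False
        # Stop early once the target is reached, as in the 200-step evaluation.
        while not done and steps < target_steps:
            state, _reward, done = env.step(policy(state))
            steps += 1
        if steps >= target_steps:
            successes += 1
    return successes / num_trials
```

With a trained Q-table, policy would be the greedy choice over the Q-values of the discretized state.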

  • Illustrating

This mode runs a single episode until the pole falls or the cart moves out of bounds, and renders it in a window, with no early stop at 200 steps. Set checkpoint_q_table in test_and_illustrate.py to the Q-table file to test (e.g. data/q_table_02lr.npy), and make sure the state numbers configured in the script match those of the Q-table.

python test_and_illustrate.py
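
One way to catch a mismatch between the configured state numbers and a loaded Q-table is a shape check before running the episode. The sketch below assumes the Q-table is an array with one axis per discretized state dimension plus a final action axis; the actual layout of the .npy files in this repository may differ. check_q_table_shape and num_bins are hypothetical names.

```python
def check_q_table_shape(q_table_shape, num_bins, n_actions=2):
    """Raise ValueError if the Q-table shape does not match the state settings.

    Assumed layout: one axis per state dimension, followed by one action axis.
    """
    expected = tuple(num_bins) + (n_actions,)
    if tuple(q_table_shape) != expected:
        raise ValueError(
            f"Q-table shape {tuple(q_table_shape)} does not match expected {expected}"
        )
    return expected


# Possible usage with NumPy (not executed here):
#   import numpy as np
#   q_table = np.load("data/q_table_02lr.npy")
#   check_q_table_shape(q_table.shape, num_bins=(6, 6, 12, 12))  # bin counts are made up
```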

Results

  • Success rate @ 200 steps

For the provided Q-table (data/q_table_02lr.npy), the model lasted 200 steps in 100000 of 100000 trials.

  • How long can it survive in one episode?

Because the environment is random, results vary between runs. In our experiments the model survived for more than 400000 steps in 6 of 10 trials and failed after 800-2000 steps in the others. Interestingly, the model seems to rarely fail once it has passed 2000 steps.
