
Code for NeurIPS 2020 paper 'Self-imitation Learning via Generalized Lower bound Q-learning'


robintyh1/nstep-sil


Code for NeurIPS 2020 paper: Self-imitation Learning via Generalized Lower Bound Q-learning

Dependencies

This implementation depends on the following libraries as well as dependencies that support these libraries.

To run experiments with simulated environments, you will also need to install

Run the code

Hyper-parameters are specified in the Python code; the most common ones can also be set on the command line, as in the examples below. After an experiment runs, its performance curves are saved in a sub-directory of the current working directory for further processing.

For example, to run the n-step SIL algorithm on a delayed environment, run:

python td3_nstep_sil.py --env HalfCheetah-v3 --seed 100 --delay 3 --nstep 5 --sil-weights 0.1

To run without SIL, set --sil-weights to 0.0:

python td3_nstep_sil.py --env HalfCheetah-v3 --seed 100 --delay 3 --nstep 5 --sil-weights 0.0

To run the return-based SIL algorithm on a delayed environment, run:

python td3_return_sil.py --env HalfCheetah-v3 --seed 100 --delay 3 --nstep 5 --sil-weights 0.1
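To compare settings across several seeds, the commands above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `sweep_commands` and the seed values are hypothetical, and only the script names and flags shown in the examples above are used.

```python
import shlex


def build_command(script, env, seed, delay, nstep, sil_weights):
    """Build the argument list for one run, using only the flags shown above."""
    return [
        "python", script,
        "--env", env,
        "--seed", str(seed),
        "--delay", str(delay),
        "--nstep", str(nstep),
        "--sil-weights", str(sil_weights),
    ]


def sweep_commands(script, env, seeds, delay, nstep, sil_weights_values):
    """Return one command per (SIL weight, seed) pair."""
    return [
        build_command(script, env, seed, delay, nstep, weight)
        for weight in sil_weights_values
        for seed in seeds
    ]


# Hypothetical sweep: three seeds, with SIL (0.1) and without SIL (0.0).
commands = sweep_commands(
    script="td3_nstep_sil.py",
    env="HalfCheetah-v3",
    seeds=[100, 200, 300],
    delay=3,
    nstep=5,
    sil_weights_values=[0.1, 0.0],
)

for command in commands:
    # Print each command as a shell-ready string; nothing is executed here.
    print(shlex.join(command))
```

Each entry in `commands` is a plain argument list, so it can be passed to `subprocess.run(command, check=True)` to launch the runs one after another.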

Citations

If you find this code base useful, please cite the following:

  • Yunhao Tang, "Self-imitation Learning via Generalized Lower Bound Q-learning". arXiv:2006.07442 [cs.LG], 2020.
