Sampling Efficient Deep Reinforcement Learning through Preference-Guided Stochastic Exploration
This work proposes a generalized and efficient epsilon-greedy exploration policy that learns a multimodal distribution aligned with the landscape of the Q-values.
Implemented on Ubuntu 20.04 with PyTorch and evaluated on OpenAI Gym benchmarks: Atari games and Classic Control.
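The exploration idea described above can be illustrated with a minimal sketch. It assumes the exploratory distribution is a plain softmax over the Q-values, which is a stand-in chosen for illustration only: the paper's policy learns this distribution, and the names `softmax`, `select_action`, and `temperature` are hypothetical and not taken from this repository.

```python
import math
import random


def softmax(q_values, temperature=1.0):
    """Turn Q-values into a probability distribution (numerically stable)."""
    m = max(q_values)
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, epsilon, rng=random, temperature=1.0):
    """Epsilon-greedy whose exploratory draw follows the Q-value landscape.

    With probability 1 - epsilon the greedy action is returned. Otherwise an
    action is sampled from softmax(Q / temperature) instead of uniformly.
    ASSUMPTION: the softmax and `temperature` are illustrative stand-ins for
    the learned preference distribution, not the repository's implementation.
    """
    if rng.random() >= epsilon:
        return max(range(len(q_values)), key=q_values.__getitem__)
    probs = softmax(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [1.0, 1.1, -3.0, 0.9]  # made-up Q-values for four actions
    counts = [0] * len(q)
    for _ in range(10000):
        counts[select_action(q, epsilon=1.0, rng=rng)] += 1
    print(counts)
```

Compared with uniform exploration, the three actions with similar Q-values are all visited often, while the clearly poor action is rarely sampled.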
If you find this repository useful for your research, please consider starring ⭐ our repo and citing our paper.
@ARTICLE{huang2023preference,
author={Huang, Wenhui and Zhang, Cong and Wu, Jingda and He, Xiangkun and Zhang, Jie and Lv, Chen},
journal={IEEE Transactions on Neural Networks and Learning Systems},
title={Sampling Efficient Deep Reinforcement Learning Through Preference-Guided Stochastic Exploration},
year={2023},
volume={},
number={},
pages={1-12},
doi={10.1109/TNNLS.2023.3317628}}
An advanced version of this work, which addresses decision-making for autonomous driving, can be found in UnaDQN.
Autonomous Driving: The performance of PGDQN is also verified through various autonomous driving simulators.
[Demo videos: 202310091343.mp4, Pong_Comp.mp4, CrazyClimber_Comp.mp4, FishingDerby_Comp.mp4]
cd to your workspace and clone the repo.
git clone https://github.com/OscarHuangWind/Preference-Guided-DQN-Atari.git
cd ~/<your workspace>/Preference-Guided-DQN-Atari
conda env create -f virtual_env.yml
If you don't use Anaconda, please manually install the dependencies listed in virtual_env.yml. You are free to change the environment name and the dependencies in virtual_env.yml.
conda activate pgdqn
Select the PyTorch build that matches your CUDA version and device (CPU/GPU), for example:
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
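After installing, it can help to confirm which PyTorch build ended up in the environment. The following is a small sketch, not part of this repository; the function name `describe_torch` is hypothetical, and it only uses standard PyTorch attributes.

```python
import importlib.util


def describe_torch():
    """Return a short report on the installed PyTorch build, if any."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If `cuda_available` is False on a GPU machine, the installed cudatoolkit version most likely does not match the driver, and a different build should be selected.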
Download Roms.rar from the Atari 2600 VCS ROM Collection
Extract the .rar file and import the ROMs:
python -m atari_py.import_roms <path to extracted folder>
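To check that the import worked, one option is to compare the games you plan to run against what atari_py reports. This is a sketch under assumptions: the helper `missing_roms` is hypothetical, `atari_py.list_games()` is expected to return the names of the imported ROMs, and the game names below are assumed to follow atari_py's snake_case naming.

```python
import importlib.util


def missing_roms(wanted, available):
    """Return the games in `wanted` that are not in `available`, in order."""
    have = set(available)
    return [game for game in wanted if game not in have]


if __name__ == "__main__":
    # ASSUMPTION: ROM names use atari_py's snake_case convention.
    wanted = ["pong", "crazy_climber", "fishing_derby"]
    if importlib.util.find_spec("atari_py") is None:
        print("atari_py is not installed in this environment")
    else:
        import atari_py

        print("missing:", missing_roms(wanted, atari_py.list_games()))
```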
python main.py
To visualize the agent, set the VISUALIZATION parameter in config.yaml to True, then run:
python main.py
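If you prefer to flip the flag from a script rather than by hand, a sketch along these lines may work. It assumes that VISUALIZATION appears in config.yaml as a top-level `KEY: value` line, which has not been checked against the actual file; the helper `set_yaml_flag` and the sample text are hypothetical.

```python
import re


def set_yaml_flag(text, key, value):
    """Replace the value of the first top-level `key: value` line in YAML text.

    Anything after the colon on that line (including a trailing comment) is
    overwritten. Raises KeyError if no such line exists.
    """
    pattern = re.compile(rf"^({re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + str(value), text, count=1)
    if count == 0:
        raise KeyError(key)
    return new_text


if __name__ == "__main__":
    # ASSUMPTION: made-up sample content, not the repository's config.yaml.
    sample = "SEED: 1\nVISUALIZATION: False\n"
    print(set_yaml_flag(sample, "VISUALIZATION", True))
```

A line-based replacement is used here so that comments and key order elsewhere in the file are left untouched, which a load-and-dump round trip through a YAML library would not guarantee.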
Change the algorithm name in main.py to the one you want to run, then run:
python main.py
Feel free to play with the parameters in config.yaml.