On this page we provide the code and all the resources related to the paper Continual Reinforcement Learning in 3D Non-stationary Environments. If you plan to use any of the resources found here, please cite our latest paper:
@article{lomonaco2019,
title={Continual Reinforcement Learning in 3D Non-stationary Environments},
author={Lomonaco, Vincenzo and Desai, Karan and Culurciello, Eugenio and Maltoni, Davide},
journal={arXiv preprint arXiv:1905.10112},
year={2019}
}
To execute the code in the repository you'll need to install the following dependencies in a Python 3.x environment:
- Basic dependencies:
pip3 install numpy scipy matplotlib sacred pymongo
- ViZDoom: RL API wrapper to ZDoom
# Install VizDoom
git clone https://github.com/mwydmuch/ViZDoom ${HOME_DIR}/vizdoom
pip3 install ${HOME_DIR}/vizdoom
- Sacred: Experiments Manager
pip3 install sacred
- PyTorch: Deep Learning framework
pip3 install http://download.pytorch.org/whl/cu80/torch-0.3.1-cp35-cp35m-linux_x86_64.whl
pip3 install torchvision
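Once the packages above are installed, a quick check along these lines can confirm that Python is able to locate them. This is only a minimal sketch: the helper `find_missing` and the list `REQUIRED_MODULES` are hypothetical names that are not part of the repository, and the import names `vizdoom`, `torch` and `torchvision` are assumed to match the installed packages.

```python
import importlib.util

# Import names for the dependencies listed above (assumed, not taken from the repo).
REQUIRED_MODULES = [
    "numpy", "scipy", "matplotlib", "sacred", "pymongo",
    "vizdoom", "torch", "torchvision",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

The check only verifies that each module can be found, not that its version matches the one used for the paper.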
The project is currently structured as follows:
- src/: The actual code for training and testing the agents.
- cfgs/: The configuration files for the hyper-parameters and the environment settings.
- artifacts/: It will be created after the setup to maintain the artifacts created by the experiments.
- scripts/: Scripts for the easy setup and run.
- LICENSE: Standard Creative Commons Attribution 4.0 International License.
- README.md: This instructions file.
First of all, let's clone the repository:
git clone https://github.com/vlomonaco/crlmaze.git
Then, in order to run the experiments reported in the paper:
cd crlmaze
sudo chmod a+x scripts/*
./scripts/setup.sh
After this initial step you can run the experiments for the implemented CRL baselines directly with the bash script ./scripts/run_exps.sh. Since these experiments can take a while (more than 40 hours in some cases), you can disable some of them by commenting them out in the bash script.
- If your results differ from our benchmark by a few percentage points, that is to be expected. First, we use the cudnn engine, which is not fully deterministic for convolutions. Second, the error may accumulate during the incremental learning process. If you want full reproducibility, set the "backend" parameter in the configuration files to "CPU".
- Hey! If you run into any trouble, don't get frustrated, just ask: we'll answer in a few hours! :-)
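For the reproducibility note above, switching the "backend" parameter to "CPU" could be scripted roughly as follows. This is a sketch under stated assumptions: it assumes a configuration file is plain JSON with a top-level "backend" key, which may not match the actual files in cfgs/. The helper `force_cpu_backend`, the file names and the "GPU" value in the demo are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def force_cpu_backend(src_path, dst_path):
    """Copy a JSON config to `dst_path` with its top-level "backend" set to "CPU".

    Assumes the config is a JSON object; the original file is left untouched.
    """
    config = json.loads(Path(src_path).read_text())
    config["backend"] = "CPU"
    Path(dst_path).write_text(json.dumps(config, indent=4))
    return config


if __name__ == "__main__":
    # Demo on a throwaway file; the keys and values here are made up.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "example.json"
        src.write_text(json.dumps({"backend": "GPU", "seed": 1}))
        print(force_cpu_backend(src, Path(tmp) / "example_cpu.json"))
```

Writing to a separate file keeps the original configuration available for the cudnn runs.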
This work is licensed under a Creative Commons Attribution 4.0 International License.
- Vincenzo Lomonaco - email: [email protected]
- Karan Desai - email: [email protected]
- Eugenio Culurciello - email: [email protected]
- Davide Maltoni - email: [email protected]