Trained without any extrinsic reward
Please check the report for details of this work.
- python3
- gym
- gym-super-mario-bros
- OpenCV
- PyTorch
- tensorboardX
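
Before training, it can help to confirm that the packages listed above are importable. The snippet below is a minimal sketch and not part of this repository: the import names (`gym_super_mario_bros`, `cv2`, `torch`) are the usual module names for these packages and are assumed here, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Assumed mapping from the packages listed above to their import names.
# These module names are not stated in this README.
REQUIRED_MODULES = {
    "gym": "gym",
    "gym-super-mario-bros": "gym_super_mario_bros",
    "OpenCV": "cv2",
    "PyTorch": "torch",
    "tensorboardX": "tensorboardX",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

If the script reports missing packages, install them before running `train.py`.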
Train the network with a separate controller (the original model, but with an LSTM as the forward network):
python3 train.py
Train the network with a controller that shares features with the ICM (our model):
python3 train.py --shared_features
Evaluate the network with a separate controller (the original model, but with an LSTM as the forward network):
python3 eval.py --name eta-0.2_stack-1_sparse_extrinsic_run1 --number 5734400
Evaluate the network with a controller that shares features with the ICM (our model):
python3 eval.py --name eta-0.2_rnn_forward_both_shared_features_stack-1_only_intrinsic_gradients_feat_run3 --number 5734400
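
The `eta-0.2` prefix in the run names above presumably refers to the scaling factor of the ICM intrinsic reward; that reading is an assumption, not something this README states. As a rough illustration, the standard ICM formulation defines the intrinsic reward as the scaled squared error between the forward network's predicted next-state features and the actual ones. The sketch below uses plain Python lists instead of PyTorch tensors, and `intrinsic_reward` is a hypothetical helper; the code in this repository may scale or reduce the error differently.

```python
def intrinsic_reward(predicted_features, actual_features, eta=0.2):
    """Return (eta / 2) * squared L2 error between two feature vectors.

    eta=0.2 is assumed from the run names; it is not confirmed by the README.
    """
    if len(predicted_features) != len(actual_features):
        raise ValueError("feature vectors must have the same length")
    # Squared L2 distance between the forward-model prediction and the target.
    squared_error = sum(
        (p - a) ** 2 for p, a in zip(predicted_features, actual_features)
    )
    return 0.5 * eta * squared_error
```

A perfect prediction gives a reward of zero, so the agent is only rewarded for reaching states that the forward network cannot yet predict.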
Code has been heavily borrowed from the following two repositories. Thanks a lot!
https://github.com/ctallec/world-models
https://github.com/jcwleo/curiosity-driven-exploration-pytorch