Aligning Text and Embodied Environments for Interactive Learning
Mohit Shridhar, Xingdi (Eric) Yuan, Marc-Alexandre Côté,
Yonatan Bisk, Adam Trischler, Matthew Hausknecht
ALFWorld contains interactive TextWorld environments (Côté et al.) that parallel embodied worlds in the ALFRED dataset (Shridhar et al.). The aligned environments allow agents to reason and learn high-level policies in an abstract space before solving embodied tasks through low-level actuation.
For the latest updates, see: alfworld.github.io
❗Work in progress❗
Clone repo:
$ git clone https://github.com/alfworld/alfworld.git alfworld
$ export ALFRED_ROOT=$(pwd)/alfworld
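The remaining commands address files relative to `ALFRED_ROOT`. As a minimal sketch of how a Python script could do the same and fail early when the variable is unset, consider the helper below. Note that `alfred_path` is a hypothetical name used for illustration and is not part of ALFWorld.

```python
import os


def alfred_path(*parts, env=None):
    """Join path components onto $ALFRED_ROOT (hypothetical helper)."""
    # Allow a custom mapping so the helper can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    root = env.get("ALFRED_ROOT")
    if not root:
        raise RuntimeError("ALFRED_ROOT is not set; export it after cloning the repo")
    return os.path.join(root, *parts)


# Example with an explicit mapping; the root directory here is made up.
example_env = {"ALFRED_ROOT": "/home/user/alfworld"}
print(alfred_path("agents", "config", "base_config.yaml", env=example_env))
```

Calling `alfred_path("data", "download_data.sh")` without the `env` argument reads the variable exported above.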
Install requirements:
# Note: Requires Python 3.6 or higher
$ virtualenv -p $(which python3.6) --system-site-packages alfworld_env # or whichever package manager you prefer
$ source alfworld_env/bin/activate
$ cd $ALFRED_ROOT
$ pip install --upgrade pip
$ pip install -r requirements.txt
Download PDDL & Game Files and pre-trained MaskRCNN detector:
$ sh $ALFRED_ROOT/data/download_data.sh
Train models:
$ cd $ALFRED_ROOT/agents
$ python dagger/train_dagger.py config/base_config.yaml
Play around with TextWorld and THOR demos.
- Data: PDDL, Game Files, Pre-trained Agents. Generating PDDL states and detection training images.
- Agents: Training and evaluating TextDAgger, TextDQN, VisionDAgger agents.
- Explore: Play around with ALFWorld TextWorld and THOR environments.
- Python 3.6
- PyTorch 1.2.0
- Torchvision 0.4.0
- AI2THOR 2.1.0
See requirements.txt for all prerequisites
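To confirm that an environment matches the versions listed above, a small check along the following lines may help. This is a sketch under assumptions: the pip distribution names (`torch`, `torchvision`, `ai2thor`) are assumed, `installed_version` and `find_mismatches` are hypothetical helpers, and `importlib.metadata` requires Python 3.8 or newer (older interpreters would need the `importlib_metadata` backport).

```python
# Pins copied from the list above; the distribution names are an assumption.
EXPECTED = {"torch": "1.2.0", "torchvision": "0.4.0", "ai2thor": "2.1.0"}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    # Imported lazily so the rest of the sketch still loads on Python < 3.8.
    from importlib import metadata
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package that differs from its pin to (found, wanted)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Ignore local build suffixes such as "1.2.0+cu92".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (found, wanted)
    return mismatches
```

Calling `find_mismatches()` with no arguments inspects the active environment; an empty dictionary means every pinned package was found at the listed version. `requirements.txt` remains the authoritative list.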
Tested on:
- GPU - GTX 1080 Ti (11GB)
- CPU - Intel Xeon (Quad Core)
- RAM - 16GB
- OS - Ubuntu 16.04
Install Docker and NVIDIA Docker.
Modify docker_build.py and docker_run.py to suit your needs.
Build the image:
$ python docker/docker_build.py
For local machines:
$ python docker/docker_run.py
# inside docker
source ~/alfworld_env/bin/activate
cd $ALFRED_ROOT
For headless VMs and Cloud-Instances:
$ python docker/docker_run.py --headless
# inside docker
tmux new -s startx # start a new tmux session
# start nvidia-xconfig (might have to run this twice)
sudo nvidia-xconfig -a --use-display-device=None --virtual=1280x1024
sudo nvidia-xconfig -a --use-display-device=None --virtual=1280x1024
# start X server on DISPLAY 0
sudo python ~/alfworld/docker/startx.py 0 # if this throws errors, e.g. "(EE) Server terminated with error (1)" or "(EE) already running ...", try a display > 0
# detach from tmux shell
# Ctrl+b then d
# source env
source ~/alfworld_env/bin/activate
# set DISPLAY variable to match X server
export DISPLAY=:0
# check THOR
cd $ALFRED_ROOT
python docker/check_thor.py
###############
## (300, 300, 3)
## Everything works!!!
You might have to modify X_DISPLAY in gen/constants.py depending on which display you use.
ALFWorld can be set up on headless machines such as AWS or Google Cloud instances. The main requirement is access to a GPU machine that supports OpenGL rendering. Run startx.py in a tmux shell:
# start tmux session
$ tmux new -s startx
# start X server on DISPLAY 0
$ sudo python $ALFRED_ROOT/scripts/startx.py 0 # if this throws errors, e.g. "(EE) Server terminated with error (1)" or "(EE) already running ...", try a display > 0
# detach from tmux shell
# Ctrl+b then d
# set DISPLAY variable to match X server
$ export DISPLAY=:0
# check THOR
$ cd $ALFRED_ROOT
$ python docker/check_thor.py
###############
## (300, 300, 3)
## Everything works!!!
You might have to modify X_DISPLAY in gen/constants.py depending on which display you use.
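When the X server ends up on a display other than 0, the `DISPLAY` variable and `X_DISPLAY` need to agree. The sketch below extracts the display number from a standard X11 `DISPLAY` string (`host:display.screen`). The helper `display_number` is hypothetical, and the assumption that `X_DISPLAY` holds a bare display number should be checked against gen/constants.py.

```python
def display_number(display):
    """Extract the display number from an X11 DISPLAY string.

    Accepts forms such as ':0', ':1.0', or 'host:10.0' and returns the
    display number as a string (hypothetical helper).
    """
    _host, sep, rest = display.rpartition(":")
    if not sep:
        raise ValueError("not a valid DISPLAY value: %r" % display)
    # Drop the optional screen suffix, e.g. the '.0' in ':1.0'.
    number = rest.split(".", 1)[0]
    if not number.isdigit():
        raise ValueError("not a valid DISPLAY value: %r" % display)
    return number


# Display 1 is what you would use if display 0 was already taken.
print(display_number(":1.0"))
```

In practice the argument would come from `os.environ["DISPLAY"]` after the `export` step above.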
Also, check out this guide: Setting up THOR on Google Cloud
ALFWorld
@inproceedings{ALFWorld20,
title ={{ALFWorld: Aligning Text and Embodied
Environments for Interactive Learning}},
author={Mohit Shridhar and Xingdi Yuan and
Marc-Alexandre C\^ot\'e and Yonatan Bisk and
Adam Trischler and Matthew Hausknecht},
booktitle = {arXiv},
year = {2020},
url = {https://arxiv.org/abs/2010.03768}
}
ALFRED
@inproceedings{ALFRED20,
title ={{ALFRED: A Benchmark for Interpreting Grounded
Instructions for Everyday Tasks}},
author={Mohit Shridhar and Jesse Thomason and Daniel Gordon and Yonatan Bisk and
Winson Han and Roozbeh Mottaghi and Luke Zettlemoyer and Dieter Fox},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
url = {https://arxiv.org/abs/1912.01734}
}
TextWorld
@inproceedings{cote2018textworld,
title={Textworld: A learning environment for text-based games},
author={C{\^o}t{\'e}, Marc-Alexandre and K{\'a}d{\'a}r, {\'A}kos and Yuan, Xingdi and Kybartas, Ben and Barnes, Tavian and Fine, Emery and Moore, James and Hausknecht, Matthew and El Asri, Layla and Adada, Mahmoud and others},
booktitle={Workshop on Computer Games},
pages={41--75},
year={2018},
organization={Springer}
}
GNU General Public License (GPL) v3.0
Questions or issues? File an issue or contact Mohit Shridhar