The Reinforcement Learning environment for AI research in μRTS, a Real-time Strategy game simulator.

Agents-Bar/gym-microrts

 
 

Gym-μRTS (pronounced "gym-micro-RTS")

This repository is a fork of Costa's repository, which provides an OpenAI Gym-compatible interface over the μRTS environment authored by Santiago Ontañón.

Note that this repository only provides the environment. To see agents in training and in action, please see the original repository.

Visualisation of an actual game

Technical Paper

Before diving into the code, we highly recommend reading the preprint of our paper: Gym-μRTS: Toward Affordable Deep Reinforcement Learning Research in Real-time Strategy Games

Deprecation note

Note that the experiments in the technical paper above were done with gym_microrts==0.3.2. As we move beyond v0.4.x, we are planning to deprecate UAS despite its better performance in the paper. This is because UAS has a more complex implementation, which makes it difficult to incorporate self-play or imitation learning in the future.

Get Started

# Make sure you have Java 8.0+ installed
$ pip install gym_microrts --upgrade

The quickest way to start is to run and modify the provided examples in the examples directory. For example, to run hello_world.py, either move to the examples directory and run python hello_world.py, or run python -m examples.hello_world from the root of this repository.

To run a partially observable example, run hello_world_po.py in this repo.

Environment Specification

Here is a description of Gym-μRTS's observation and action space:

  • Observation Space. (Box(0, 1, (h, w, 27), int32)) Given a map of size h x w, the observation is a tensor of shape (h, w, n_f), where n_f is a number of feature planes that have binary values. The observation space used in this paper uses 27 feature planes as shown in the following table. A feature plane can be thought of as a concatenation of multiple one-hot encoded features. As an example, if there is a worker with hit points equal to 1, not carrying any resources, owner being Player 1, and currently not executing any actions, then the one-hot encoding features will look like the following:

    [0,1,0,0,0], [1,0,0,0,0], [1,0,0], [0,0,0,0,1,0,0,0], [1,0,0,0,0,0]

    The 27 feature-plane values at the map position of this worker are therefore:

    [0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0]

  • Partial Observation Space. (Box(0, 1, (h, w, 29), int32)) Given a map of size h x w, the observation is a tensor of shape (h, w, n_f), where n_f is a number of feature planes that have binary values. The observation space for partial observability uses 29 feature planes as shown in the following table. A feature plane can be thought of as a concatenation of multiple one-hot encoded features. As an example, if there is a worker with hit points equal to 1, not carrying any resources, owner being Player 1, currently not executing any actions, and not visible to the opponent, then the one-hot encoding features will look like the following:

    [0,1,0,0,0], [1,0,0,0,0], [1,0,0], [0,0,0,0,1,0,0,0], [1,0,0,0,0,0], [1,0]

    The 29 feature-plane values at the map position of this worker are therefore:

    [0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0]

  • Action Space. (MultiDiscrete([hw 6 4 4 4 4 7 a_r])) Given a map of size h x w and the maximum attack range a_r=7, the action is an 8-dimensional vector of discrete values as specified in the following table. The first component of the action vector selects the unit on the map to issue the action to, the second is the action type, and the remaining components are the parameters that the different action types can take. Depending on which action type is selected, the game engine uses the corresponding parameters to execute the action. As an example, if the RL agent issues a move south action to the worker at x=3, y=2 in a 10x10 map, the action is encoded as follows:

    [3+2*10,1,2,0,0,0,0,0]
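
The encodings described above can be reproduced with a few lines of plain Python. This is a minimal sketch that only mirrors the worked examples in this section; it does not call into gym_microrts. The group sizes and the numeric values for "move" and "south" are read off those examples, and every function and constant name below is hypothetical.

```python
# Sizes of the one-hot groups, read off the worked example (5 + 5 + 3 + 8 + 6 = 27).
FULL_OBS_GROUPS = (5, 5, 3, 8, 6)
# Partial observability appends one extra 2-value group (27 + 2 = 29).
PARTIAL_OBS_GROUPS = FULL_OBS_GROUPS + (2,)

# Values taken from the move-south example; the constant names are hypothetical.
ACTION_TYPE_MOVE = 1
DIRECTION_SOUTH = 2


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for group of size {size}")
    return [1 if i == index else 0 for i in range(size)]


def encode_cell(indices, group_sizes):
    """Concatenate one one-hot vector per feature group for a single map cell."""
    if len(indices) != len(group_sizes):
        raise ValueError("need exactly one index per feature group")
    values = []
    for index, size in zip(indices, group_sizes):
        values.extend(one_hot(index, size))
    return values


def encode_move(x, y, map_width, direction):
    """Build the 8-component action vector for a move action.

    The source unit is addressed by its flattened position x + y * map_width;
    the parameters that a move action does not use are left at 0.
    """
    source_unit = x + y * map_width
    return [source_unit, ACTION_TYPE_MOVE, direction, 0, 0, 0, 0, 0]


# Indices of the active entry in each group of the worker example.
worker = encode_cell((1, 0, 0, 4, 0), FULL_OBS_GROUPS)
# Same worker under partial observability, with the extra group set to [1, 0].
worker_po = encode_cell((1, 0, 0, 4, 0, 0), PARTIAL_OBS_GROUPS)
# Move south for the worker at x=3, y=2 on a 10x10 map.
move_south = encode_move(x=3, y=2, map_width=10, direction=DIRECTION_SOUTH)
```

Here worker and worker_po hold the 27- and 29-value vectors for the example worker, and move_south holds the action vector with 3+2*10 evaluated to 23.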

Preset Envs:

Gym-μRTS comes with preset environments for common tasks as well as engaging the full game. Feel free to check out the following benchmark:

Below are the differences between the versioned environments:

version | use frame skipping | complete invalid action masking            | issuing actions to all units simultaneously | map size
v1      | frame skip = 9     | only partial mask on source unit selection | no                                          | 10x10
v2      | no                 | yes                                        | yes                                         | 10x10
v3      | no                 | yes                                        | yes                                         | 16x16

Setting up the API server

To start the API server, first install all dependencies in the api extras with poetry install --extras api. Once the installation succeeds, start the server with

uvicorn gym_microrts.api:app

The command above serves the API on localhost:8000. Go to http://localhost:8000/docs to see the available endpoints and how to call them.
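
The available endpoints can also be listed from a script instead of the browser. The sketch below assumes the server is a FastAPI-style application that publishes its schema at /openapi.json; that path is an assumption, not something this README confirms. The function names are hypothetical, and only the Python standard library is used.

```python
import json
from urllib.request import urlopen

# HTTP methods that may appear as keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_openapi_spec(base_url="http://localhost:8000"):
    """Download the OpenAPI schema (assumes it is served at /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_endpoints(spec):
    """Return sorted 'METHOD /path' strings for every operation in the schema."""
    endpoints = []
    for path, operations in sorted(spec.get("paths", {}).items()):
        for method in sorted(operations):
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS:
                endpoints.append(f"{method.upper()} {path}")
    return endpoints
```

With the server running, list_endpoints(fetch_openapi_spec()) returns one entry per operation. If the schema is served elsewhere, the /docs page remains the reference.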

Developer Guide

We highly recommend using an isolated environment rather than the global one. For example, to set up and activate Python's official virtual environment, execute

python -m venv .venv
source .venv/bin/activate

This creates a .venv directory, and all packages will be installed under .venv/.

Submodule

To run the tests you might need to check out microrts. Since it is included as a submodule, you can check it out using

git submodule update --init --recursive

Java

To run this environment, the Java code must be bundled into a jar so that jpype can import it. Once you have checked out microrts, execute the build.sh script from within the microrts directory. As mentioned above, the jar needs to be built with Java 8. Newer LTS releases exist (Java 11, with Java 17 due in September 2021), but Java 8 is reportedly supported until 2030.

If you are using Ubuntu, you can install Java with sudo apt install openjdk-8-jdk. This should work even if you have a newer Java version installed, but you then need to switch the active Java version using

sudo update-alternatives --config java
sudo update-alternatives --config javac

or update link to /usr/bin/javac from `/usr/lib/jvm
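
After switching versions, it can help to confirm which Java major version is active before running build.sh. The following is a minimal sketch with hypothetical function names. It assumes that java -version prints a line containing version "..." (Java 8 reports itself as 1.8.x) and that this output goes to stderr, as it does on common OpenJDK builds.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output.

    Java 8 and earlier report versions as "1.8.0_292"; newer releases
    report "11.0.11" or "17", so a leading "1." has to be skipped.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a Java version string")
    first, second = match.group(1), match.group(2)
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Run `java -version` and return the active major version."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    # The version banner is normally written to stderr, not stdout.
    return parse_java_major(result.stderr + result.stdout)
```

If installed_java_major() returns something other than 8, the update-alternatives commands above can be used to switch.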

Other

Required dev environment

# install pyenv
curl https://pyenv.run | bash
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init --path)"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
source ~/.bashrc

# install python 3.9.5
pyenv install 3.9.5
pyenv global 3.9.5

# install pipx
pip install pipx

# install other dev dependencies
pipx install poetry
pipx install isort
pipx install black
pipx install autoflake
pipx ensurepath

# install gym-microrts
git clone --recursive https://github.com/vwxyzjn/gym-microrts.git && \
cd gym-microrts
pyenv install -s $(sed "s/\/envs.*//" .python-version)
pyenv virtualenv $(sed "s/\/envs\// /" .python-version)
poetry install

# build microrts, then return to the repository root
cd gym_microrts/microrts && bash build.sh > build.log && cd ../..
python hello_world.py

Known issues

  • Rendering does not work properly on macOS. See jpype-project/jpype#906

Papers written using Gym-μRTS
