Skip to content

Latest commit

 

History

History
44 lines (33 loc) · 2.34 KB

README.md

File metadata and controls

44 lines (33 loc) · 2.34 KB

Generative Adversarial Imitation Learning in tensorflow on PySC2

Tensorflow implementation of Generative Adversarial Imitation Learning, and apply GAIL on PySC2

disclaimers: some code is borrowed from @openai/baselines and @andrewliao

What's GAIL?

  • model free imtation learning -> low sample efficiency in training time
    • model-based GAIL: End-to-End Differentiable Adversarial Imitation Learning
  • Directly extract policy from demonstrations
  • Remove the RL optimization from the inner loop od inverse RL
  • Some work based on GAIL:
    • Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs
    • Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
    • Robust Imitation of Diverse Behaviors

Requirements

  • python==3.5.2
  • tensorflow==1.1.0
  • gym==0.9.3
  • pysc2

Run the code

Actions in PySC2 is composed of action id and extra parameters, eg to move a minion, RL agents need to provide corresponding action id and coordinates on map. I use GAIL to learn to choose reasonable action id, and use a separate supervised learning neural network to obtain correct parameters.

To get an idea of how I parse the .SC2Replay files, refer to [parse recording file]

The trained parameter network should be put under param_pre_model. The pre-trained model is trained by running the codes in [parameter model], this pretrained model is used to supply the parameters for each

In master branch, run python3 main.py to start training, the model will be saved every 100 episode

To evaluate, git checkout UsePPOParameterSharingEvaluate to evaluate the model, the trained model should be put in /checkpoint.

Result

The result is not quite ideal, the agents only learns to construct building and have a few minions patrol around.

Reference

To inspect this project in detail, proceed to the [report]

  • Jonathan Ho and Stefano Ermon. Generative adversarial imitation learning, [arxiv]
  • @openai/imitation
  • @openai/baselines

Feel free to contact me if you want the trained model for the pretrained parameter network and GAIL network