An implementation of multi-agent reinforcement learning algorithms in PyTorch, including Grid-Wise Control, QMIX, and Centralized PPO. Different learning strategies can be specified during training, and models and experimental data can be saved.
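As a taste of how one of these algorithms works, below is a minimal sketch of QMIX's monotonic mixing network in PyTorch (see the QMIX reference at the bottom). It is illustrative only, not this repository's implementation; the layer sizes and hypernetwork structure are assumptions.

```python
import torch
import torch.nn as nn

class QMixer(nn.Module):
    """Sketch of a QMIX-style monotonic mixing network (illustrative only;
    layer sizes are assumptions, not this repository's actual values)."""

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetworks generate the mixing weights from the global state.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        bs = agent_qs.size(0)
        agent_qs = agent_qs.view(bs, 1, self.n_agents)
        # Absolute weights keep Q_tot monotonic in every per-agent Q-value.
        w1 = torch.abs(self.hyper_w1(state)).view(bs, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(bs, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs, w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(bs, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(bs, 1, 1)
        q_tot = torch.bmm(hidden, w2) + b2  # (batch, 1, 1)
        return q_tot.view(bs, 1)
```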
Quick Start: Run the main.py script to start training. Please specify all parameters in the config.yaml file (the parameters used in this project are not optimal; please adjust them according to your actual requirements).
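For orientation, here is a hedged sketch of how main.py might read config.yaml (assuming PyYAML is installed; the key names below are hypothetical, not the repository's actual schema):

```python
# Hypothetical sketch of config loading; key names are assumptions.
import yaml

with open("config.yaml", "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

# e.g. "grid_wise_control", "qmix", or "centralized_ppo" (names assumed)
algorithm = config.get("algorithm", "qmix")
num_episodes = config.get("num_episodes", 10000)
save_model = config.get("save_model", True)
```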
MPE: Multi Particle Environments (MPE) are a set of communication-oriented environments where particle agents can (sometimes) move, communicate, see each other, push each other around, and interact with fixed landmarks.
These environments are from OpenAI’s MPE codebase, with several minor fixes, mostly related to making the action space discrete by default, making the rewards consistent and cleaning up the observation space of certain environments.
The environment used in this project is Simple Spread (I'm also considering adding other environments in future releases).
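A minimal sketch of creating and stepping Simple Spread through PettingZoo's parallel API, using random actions just to show the interaction loop (the module's version suffix may differ between PettingZoo releases):

```python
# Illustrative random-action rollout; not this repository's training loop.
from pettingzoo.mpe import simple_spread_v2  # suffix may be v3 on newer PettingZoo

env = simple_spread_v2.parallel_env(N=3, max_cycles=25, continuous_actions=False)
env.reset(seed=42)
while env.agents:
    # Sample a random discrete action for every live agent.
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
```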
Note: The following are suggested versions only; the program may still work with other versions.
| Name | Version |
|---|---|
| Python | 3.10.9 |
| gymnasium | 0.28.1 |
| numpy | 1.23.5 |
| PettingZoo | 1.23.0 |
| PyTorch | 1.12.1 |
Update on 4.10.2023: PyTorch 2.0.0+cu118 on Python 3.9.16 works. Note that Python versions above 3.9 will not work with this setup, because PettingZoo 1.12.0 is not available for them.
References:
- Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
- PettingZoo:
@article{terry2020pettingzoo,
  title={PettingZoo: Gym for Multi-Agent Reinforcement Learning},
  author={Terry, J. K and Black, Benjamin and Grammel, Nathaniel and Jayakumar, Mario and Hari, Ananth and Sullivan, Ryan and Santos, Luis and Perez, Rodrigo and Horsch, Caroline and Dieffendahl, Clemens and Williams, Niall L and Lokesh, Yashas and Ravi, Praveen},
  journal={arXiv preprint arXiv:2009.14471},
  year={2020}
}