Available games

(no marker): thoroughly tested. In many cases, we verified against known values and/or reproduced results from papers.

~: implemented but lightly tested.

| Game | Reference | Status |
| --- | --- | --- |
| Backgammon | Wikipedia | |
| Breakthrough | Wikipedia | |
| Bridge | Wikipedia | |
| (Uncontested) Bridge bidding | Wikipedia | |
| Catch | Mnih et al. '14, Recurrent Models of Visual Attention; Osband et al. '19, Behaviour Suite for Reinforcement Learning, Appendix A | ~ |
| Cliff Walking | Sutton et al. '18, page 132 | ~ |
| Clobber | Wikipedia | ~ |
| Coin Game | Raileanu et al. '18, Modeling Others using Oneself in Multi-Agent Reinforcement Learning | ~ |
| Connect Four | Wikipedia | |
| Cooperative Box-Pushing | Seuken & Zilberstein '12, Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs | ~ |
| Chess | Wikipedia | |
| Deep Sea | Osband et al. '17, Deep Exploration via Randomized Value Functions | ~ |
| First-price Sealed-bid Auction | Wikipedia | |
| Gin Rummy | Wikipedia | |
| Go | Wikipedia | |
| Goofspiel | Wikipedia | |
| Hanabi (via Hanabi Learning Environment) | Wikipedia and Bard et al. '19, The Hanabi Challenge: A New Frontier for AI Research | |
| Havannah | Wikipedia | |
| Hearts | Wikipedia | ~ |
| Hex | Wikipedia | ~ |
| Kuhn poker | Wikipedia | |
| Laser Tag | Leibo et al. '17, Lanctot et al. '17 | ~ |
| Leduc poker | Southey et al. '05, Bayes’ bluff: Opponent modelling in poker | |
| Lewis Signaling | Wikipedia | ~ |
| Liar's Dice | Wikipedia | |
| Markov Soccer | Littman '94, Markov games as a framework for multi-agent reinforcement learning; He et al. '16, Opponent Modeling in Deep Reinforcement Learning | ~ |
| Matching Pennies (three-player) | "Three problems in learning mixed-strategy Nash equilibria" | |
| Negotiation | Lewis et al. '17, Cao et al. '18 | |
| Oshi-Zumo | Buro '04, Solving the oshi-zumo game; Bosansky et al. '16, Algorithms for Computing Strategies in Two-Player Simultaneous Move Games | |
| Oware | Wikipedia | |
| Pentago | Wikipedia | |
| Phantom Tic-Tac-Toe | Auger '11, Multiple Tree for Partially Observable Monte-Carlo Tree Search; Lisy '14, Alternative Selection Functions for Information Set Monte Carlo Tree Search; Lanctot '13 | ~ |
| Pig | Wikipedia | |
| Poker (Hold'em, via ACPC code base) | Wikipedia | ~ |
| Quoridor | Wikipedia | |
| Skat (simplified bidding) | Wikipedia | ~ |
| Tic-Tac-Toe | Wikipedia | |
| Tiny Bridge | | |
| Tiny Hanabi | Foerster et al. '18, Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | |
| Trade Comm | A simple emergent communication game based on trading. | |
| Y | Wikipedia | |
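
To make "verified against known values" concrete, here is a minimal sketch of that kind of check for Tic-Tac-Toe, written in plain Python. It is not this project's implementation or test suite: the board encoding (a 9-character string, row-major, `.` for empty) and the function names `winner`, `minimax_value`, and `count_playouts` are hypothetical and chosen only for illustration. The two values it checks are standard results for the game: optimal play is a draw, and there are 255,168 distinct complete move sequences.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8 (row-major).
LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)

EMPTY_BOARD = "." * 9  # hypothetical encoding: '.' empty, 'x' / 'o' marks


def winner(board):
    """Return 'x' or 'o' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax_value(board=EMPTY_BOARD, player="x"):
    """Game value from x's point of view: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == "x" else -1
    if "." not in board:
        return 0  # full board with no winner is a draw
    other = "o" if player == "x" else "x"
    values = [
        minimax_value(board[:i] + player + board[i + 1:], other)
        for i, cell in enumerate(board)
        if cell == "."
    ]
    # x maximizes the value, o minimizes it.
    return max(values) if player == "x" else min(values)


def count_playouts(board=EMPTY_BOARD, player="x"):
    """Count complete move sequences (leaves of the full game tree)."""
    if winner(board) is not None or "." not in board:
        return 1
    other = "o" if player == "x" else "x"
    return sum(
        count_playouts(board[:i] + player + board[i + 1:], other)
        for i, cell in enumerate(board)
        if cell == "."
    )


if __name__ == "__main__":
    # Compare the computed quantities against the known values.
    assert minimax_value() == 0
    assert count_playouts() == 255168
    print("Tic-Tac-Toe checks passed")
```

A game marked `~` in the table has, by the legend's definition, had fewer checks of this kind run against it.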