Guidelines

Above all, OpenSpiel is designed to be easy to install and use, easy to understand, easy to extend (“hackable”), and general/broad. OpenSpiel is built around two important design criteria:

  • Keep it simple. Simple choices are preferred to more complex ones. The code should be readable, usable, and extendable by non-experts in the programming language(s), especially by researchers from potentially different fields. OpenSpiel provides reference implementations to learn from and prototype with, rather than fully-optimized / high-performance code that would require additional assumptions (narrowing the scope / breadth) or advanced (or lower-level) language features.

  • Keep it light. Dependencies can be problematic for long-term compatibility, maintenance, and ease of use. Unless there is strong justification, we tend to avoid introducing dependencies to keep things easy to install and more portable.

Support expectations

We, the OpenSpiel authors, are committed to engaging with and supporting the community. Since this can be time-consuming, we try to find a good balance between being responsive and being able to continue our day-to-day work and research.

Generally speaking, if you want a specific feature implemented, the most effective way is to implement it and send a pull request. For large changes, or ones involving design decisions, open an issue first to check that the idea is acceptable.

The higher the quality of a contribution, the easier it will be to accept. For instance, following the C++ Google style guide and the Python Google style guide will help with the integration.

As examples, MacOS support, Windows support, example improvements, various bug fixes, and new games have been straightforward to include, and we are very thankful to everyone who helped.

Bugs

We aim to respond to bug reports at a reasonable pace, several times a week. However, for bugs that would involve large changes (e.g. adding new games, adding public state support), we cannot commit to implementing them ourselves and encourage everyone to contribute directly.

Pull requests

You can expect us to answer or comment back, and the comments will indicate whether the pull request can be merged as is or needs additional work.

Pull requests are merged in batches to be more efficient, at least every two weeks (bug fixes will likely be integrated faster), so you may need to wait a little after approval to actually see your change merged.

Roadmap and Call for Contributions

Contributions to this project must be accompanied by a Contributor License Agreement (CLA). See CONTRIBUTING.md for the details.

Here, we outline our intentions for the future, giving an overview of what we hope to add over the coming years. We also suggest a number of contributions that we would like to see, but have not had the time to add ourselves.

Before making a contribution to OpenSpiel, please read the guidelines. We also kindly request that you contact us before writing any large piece of code, in case (a) we are already working on it and/or (b) it's something we have already considered and may have some design advice on its implementation. Please also note that some games may be covered by copyright and might require legal approval. Otherwise, happy hacking!

The following list is both a Call for Contributions and an idealized roadmap. We are certainly planning to add some of these ourselves (and, in some cases, already have implementations that were just not tested well enough to make the release!). Contributions are certainly not limited to these suggestions!

  • AlphaZero. An implementation of AlphaZero. Preferably, an implementation that closely matches the pseudo-code provided in the paper.

  • Checkers / Draughts. This is a classic game and an important one in the history of game AI ("Checkers is solved").

  • Chinese Checkers / Halma. Chinese Checkers is the canonical multiplayer (more than two players) perfect information game. Currently, OpenSpiel does not contain any games in this category.

  • Correlated Equilibrium. There is a simple linear program that can be solved to find a correlated equilibrium in a normal-form game (see Section 4.6 of Shoham & Leyton-Brown '09); a minimal sketch of this LP appears after this list. This would be a nice complement to the existing solving of zero-sum games in python/algorithms/lp_solver.py.

  • Deep TreeStrap. An implementation of TreeStrap (see Bootstrapping from Game Tree Search), except with a DQN-like replay buffer, storing value targets obtained from minimax searches. We have an initial implementation, but it is not yet ready for release. We also hope to support PyTorch for this algorithm.

  • Double Neural Counterfactual Regret Minimization. This is an approach similar to Regression CFR that uses a robust sampling technique and a new network architecture that predicts both the cumulative regret and the average strategy. (Ref)

  • Differentiable Games and Algorithms. For example, Symplectic Gradient Adjustment (Ref).

  • Emergent Communication Algorithms. For example, RIAL and/or DIAL and CommNet.

  • Emergent Communication Games. Referential games such as the ones in Ref1, Ref2, Ref3.

  • Extensive-form Evolutionary Dynamics. There have been a number of different evolutionary dynamics suggested for sequential games, such as state-coupled replicator dynamics (Ref), sequence-form replicator dynamics (Ref1, Ref2), sequence-form Q-learning (Ref), and logit dynamics (Ref). The normal-form replicator dynamics that these generalize is shown after this list for reference.

  • Game Query/Customization API. There is no easy way to retrieve game-specific information since all the algorithms interact with the general API only. But sometimes this is necessary, such as when a technique is being tested or specialized on one game. There is also no way to change the representation of observations without changing the implementation of the game. This module would expose game-specific information via queries and customization without having to hack the game implementations directly.

  • General Games Wrapper. Several general game engine languages and databases of general games currently exist, for example within the general game-playing project and the Ludii General Game System. A very nice addition to OpenSpiel would be a game that interprets games represented in these languages and presents them as OpenSpiel games. This could open up the potential of evaluating learning agents on hundreds to thousands of games.

  • Go API. We currently have a prototype Go API similar to the Python API. It is exposed using cgo via a C API much like the CFFI Python bindings from the Hanabi Learning Environment. It is not currently ready for release, but should be possible in a future update.

  • Grid Worlds. There are currently four grid world games in OpenSpiel: Markov soccer, the coin game, cooperative box-pushing, and laser tag. There could be more, especially ones that have been commonly used in multiagent RL. Also, the current grid worlds can be improved (they are all fully observable).

  • Heuristic Payoff Tables and Empirical Game-Theoretic Analysis. Methods found in Analyzing Complex Strategic Interactions in Multi-Agent Systems, Methods for Empirical Game-Theoretic Analysis, An evolutionary game-theoretic analysis of poker strategies, Ref4.

  • Monte Carlo Tree Search Solver. A general enhancement to Monte Carlo tree search that backpropagates proven wins and losses as far up the tree as possible; a sketch of the backup rule appears after this list. See Winands et al. '08.

  • Minimax-Q and other classic MARL algorithms. Minimax-Q is a classic multiagent reinforcement learning algorithm (Markov games as a framework for multi-agent reinforcement learning); its update rule is sketched after this list. Other classic algorithms, such as Correlated Q-learning, NashQ, and Friend-or-Foe Q-learning (Friend-or-foe q-learning in general-sum games), would be welcome as well.

  • Nash Averaging. An evaluation tool first described in Re-evaluating Evaluation.

  • Negotiation Games. A game similar to the negotiation game presented in Ref1, Ref2. Also, Colored Trails (Modeling how Humans Reason about Others with Partial Information, Metastrategies in the coloredtrails game).

  • Opponent Modeling / Shaping Algorithms. For example, DRON, LOLA, and Stable Opponent Shaping.

  • PyTorch. While we officially support TensorFlow, the API is agnostic to the library that is used for learning. We would like to have some examples and support for PyTorch as well in the future.

  • Repeated Games. There is currently no explicit support for repeated games. Supporting repeated games as one sequential game could be useful for applying RL algorithms. This could take the form of another game transform, where intermediate rewards are given for game instances. It could also support random termination, as found in the literature and in tournaments.

  • Sequential Social Dilemmas. Sequential social dilemmas, such as the ones found in Ref1, Ref2. Wolfpack could be a nice one, since pursuit-evasion games have been common in the literature (Ref). Also the coin games from Ref1 and Ref2, and Clamity, Cleanup and/or Harvest from Ref3, Ref4.

  • Single-Agent Games and Environments. There are currently no single-player (i.e. solitaire) games or traditional RL environments implemented (in C++, accessible to the entire code base) despite the API supporting this use case. Games that fit into this category, such as Morpion and Klondike, as well as traditional RL environments such as grid worlds that have been commonly used in AI research, would be welcome contributions.

  • Structured Action Spaces. Currently, actions are integers between 0 and some value. There is no easy way to interpret what each action means in a game-specific way, nor is there any way to easily represent a composite action in terms of its parts. A structured action space could represent actions as a sequence of values (like information states and observations, and could also include shapes), which can be learned instead of mappings to flat numbers. Then, each game could have a mapping from the structured action to the action taken.

  • TF_Trajectories. The source code currently includes support for batch inference, running a batch of episodes using TensorFlow directly from C++ (in contrib/). It has not yet been tested with CMake and public TensorFlow. We would like to officially support this and move it into the core library.
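
As a concrete starting point for the Correlated Equilibrium item above, here is a minimal sketch of the linear program for a two-player normal-form game, solved with SciPy. The function name, the welfare-maximizing objective, and the use of scipy.optimize.linprog are illustrative choices for this sketch, not part of OpenSpiel.

```python
# Hedged sketch: a correlated equilibrium of a 2-player normal-form game as a
# linear program (Shoham & Leyton-Brown '09, Section 4.6). The variables are
# the joint-action probabilities mu[a1, a2], flattened row-major.
import numpy as np
from scipy.optimize import linprog


def correlated_equilibrium(u1, u2):
  """u1, u2: payoff matrices of shape (n1, n2) for players 1 and 2."""
  n1, n2 = u1.shape
  num_vars = n1 * n2
  a_ub, b_ub = [], []

  # Player 1 incentive constraints: for each recommended action a1 and each
  # deviation d1, sum_{a2} mu(a1, a2) * (u1[a1, a2] - u1[d1, a2]) >= 0.
  for a1 in range(n1):
    for d1 in range(n1):
      if d1 == a1:
        continue
      row = np.zeros(num_vars)
      for a2 in range(n2):
        row[a1 * n2 + a2] = -(u1[a1, a2] - u1[d1, a2])  # negated for <= 0 form
      a_ub.append(row)
      b_ub.append(0.0)

  # Player 2 incentive constraints (symmetric, over columns).
  for a2 in range(n2):
    for d2 in range(n2):
      if d2 == a2:
        continue
      row = np.zeros(num_vars)
      for a1 in range(n1):
        row[a1 * n2 + a2] = -(u2[a1, a2] - u2[a1, d2])
      a_ub.append(row)
      b_ub.append(0.0)

  # Probabilities sum to one; maximize social welfare (one arbitrary choice
  # among the many valid LP objectives).
  result = linprog(
      c=-(u1 + u2).reshape(-1),
      A_ub=np.array(a_ub), b_ub=np.array(b_ub),
      A_eq=np.ones((1, num_vars)), b_eq=[1.0],
      bounds=[(0, None)] * num_vars)
  return result.x.reshape(n1, n2)


# Example usage on the game of Chicken (action 0 = swerve, action 1 = dare).
u1 = np.array([[6.0, 2.0], [7.0, 0.0]])
u2 = np.array([[6.0, 7.0], [2.0, 0.0]])
print(correlated_equilibrium(u1, u2))
```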
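
For the Extensive-form Evolutionary Dynamics item, the dynamics listed there generalize the standard single-population replicator dynamics for a symmetric normal-form game with payoff matrix $A$, which (as a point of reference only, in standard notation) reads:

$$\dot{x}_i \;=\; x_i \left[ (A x)_i - x^\top A x \right],$$

where $x$ is the population's mixed strategy, $(Ax)_i$ is the expected payoff of pure strategy $i$ against $x$, and $x^\top A x$ is the population's average payoff.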
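
For the Monte Carlo Tree Search Solver item, the core addition to plain MCTS is a backup rule for proven values. Below is a minimal sketch for a two-player, alternating-move, win/loss game; Node, children, parent, and proven_value are hypothetical names for this sketch, not the OpenSpiel MCTS API.

```python
# Hedged sketch of the MCTS-Solver backup rule (Winands et al. '08).
# proven_value is stored from the perspective of the player to move at the
# node: WIN or LOSS if the node is proven, None otherwise.
WIN, LOSS = 1, -1


def backup_proven(node):
  """Propagates proven values from a node toward the root."""
  while node is not None:
    child_values = [child.proven_value for child in node.children]
    if any(value == LOSS for value in child_values):
      # Some move leads to a position the opponent has provably lost, so the
      # player to move here has a proven win.
      node.proven_value = WIN
    elif node.children and all(value == WIN for value in child_values):
      # Every move leads to a proven opponent win, so this node is lost.
      node.proven_value = LOSS
    else:
      # The node is not proven; stop and fall back to the usual MCTS backup.
      break
    node = node.parent
```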
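
For the Minimax-Q item, the update from Markov games as a framework for multi-agent reinforcement learning maintains a table $Q(s, a, o)$ over the agent's action $a$ and the opponent's action $o$ in a zero-sum Markov game (shown here in standard notation for reference):

$$Q(s, a, o) \leftarrow (1 - \alpha)\, Q(s, a, o) + \alpha \left[ r + \gamma\, V(s') \right], \qquad V(s) = \max_{\pi \in \Delta(A)} \min_{o \in O} \sum_{a \in A} \pi(a)\, Q(s, a, o),$$

where the maximin value $V(s)$ is computed with a small linear program, much like the existing zero-sum solving in python/algorithms/lp_solver.py.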