forked from pbsinclair42/MCTS

A simple package to allow users to run Monte Carlo Tree Search on any perfect information domain


Anton-Gasse/MCTS

 
 


MCTS

This package provides a simple way of using Monte Carlo Tree Search in any perfect information domain.

Installation

With pip: `pip install mcts`

Without pip: download the zip/tar.gz file of the latest release, extract it, and run `python setup.py install`

Quick Usage

In order to run MCTS, you must implement a State class which can fully describe the state of the world. It must also implement five methods:

  • getCurrentPlayer(): Returns 1 if it is the maximizer's turn to choose an action, or -1 if it is the minimizer's turn
  • getPossibleActions(): Returns an iterable of all actions which can be taken from this state
  • takeAction(action): Returns the state which results from taking action action
  • isTerminal(): Returns True if this state is a terminal state
  • getReward(): Returns the reward for this state. Only needed for terminal states.

You must also choose a hashable representation for an action as used in getPossibleActions and takeAction. Typically this would be a class with a custom __hash__ method, but it could also simply be a tuple or a string.
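As an illustration, here is a minimal sketch of such a State class for a toy game of Nim. The game, the class name, and the reward convention are illustrative assumptions, not part of the package; rewards are assumed to be from the maximizer's perspective (+1 for a maximizer win, -1 for a minimizer win):

```python
class NimState:
    """Toy Nim: players alternately remove 1-3 stones; taking the last stone wins."""

    def __init__(self, stones=10, player=1):
        self.stones = stones  # stones left on the pile
        self.player = player  # 1 = maximizer to move, -1 = minimizer to move

    def getCurrentPlayer(self):
        return self.player

    def getPossibleActions(self):
        # Actions are plain ints, which are hashable as required
        return [n for n in (1, 2, 3) if n <= self.stones]

    def takeAction(self, action):
        # Return a new state rather than mutating this one
        return NimState(self.stones - action, -self.player)

    def isTerminal(self):
        return self.stones == 0

    def getReward(self):
        # The player who just took the last stone is -self.player;
        # reward is +1 if the maximizer won, -1 otherwise
        return -self.player
```

With a class like this, passing `NimState()` as the initial state to the searcher would return one of the integer actions.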

Once these have been implemented, running MCTS is as simple as initializing your starting state, then running:

```python
from mcts import mcts

searcher = mcts(initialState=initialState, timeLimit=1000)
bestAction = searcher.search()
searcher.save(path='./root')

newSearcher = mcts(path="./root", iterationLimit=80000)
action = newSearcher.search()

print(action)
```

Here timeLimit=1000 is given in milliseconds. Alternatively, you can use iterationLimit=1600 to specify the number of rollouts. Exactly one of timeLimit and iterationLimit must be specified. The expected reward of the best action can be obtained by passing needDetails=True to search():

You can save the current search progress with the save() method and resume it later by passing the saved path when creating a new instance.

```python
resultDict = searcher.search(needDetails=True)
print(resultDict.keys())  # currently dict_keys(['action', 'expectedReward'])
```

See naughtsandcrosses.py for a simple example.

Slow Usage

//TODO

Collaborating

Feel free to raise a new issue for any new feature or bug you've spotted. Pull requests are also welcome if you're interested in directly improving the project.
