Posted 29 Dec 2017

Monte Carlo Tree Search

Updated 2 Jan 2018
Using Monte Carlo simulation to find the best state inside a tree of actions


The Monte Carlo Method

Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.

In other words, Monte Carlo simulation lets you build forecast models by randomly sampling a given collection of states, which is useful for a vast range of forecasting problems and for game AI development.
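To make the idea concrete, here is a minimal Python sketch (not part of the example project) that estimates π by random sampling: the fraction of random points in the unit square that land inside the quarter circle approaches π/4.

```python
import random

def estimate_pi(samples: int) -> float:
    """Estimate pi by sampling random points in the unit square
    and counting how many fall inside the quarter circle."""
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples

print(estimate_pi(1_000_000))  # approaches 3.14159... as samples grow
```

The more samples we draw, the closer the estimate gets to the true value; MCTS applies the same principle to evaluating positions in a game tree.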

Monte Carlo tree search consists of 4 steps:

  1. Tree Traversal (Selection)

    In the first step, starting from State(0) (the root), we ask: “Is the current node a leaf node?” (i.e., has it not been expanded yet?)

    If it is a leaf and its visit count is 0, go directly to the rollout step.
    Otherwise, if it is not a leaf, move to the child that maximizes UCB1 (see the example project), and repeat until a leaf node is found. UCB1 balances exploiting high-value children against exploring rarely visited ones:

    UCB1(i) = Vi + C * sqrt(ln(N) / ni)

    where Vi is the average value of child i, ni is its visit count, N is the parent’s visit count, and C is an exploration constant (often sqrt(2)).

  2. Node Expansion

    In the second step, if the current node is a leaf but has already been visited (visit count > 0), we expand it:
    for each available action from the current state, add a new child node to the tree, then set current to the first newly added node.

  3. Rollout (Simulation)

    Rollout is the process of randomly sampling available actions from the given state until a terminal state (win/lose) is reached.

  4. Backpropagation

    Finally, we backpropagate the rollout result and the visit count from the simulated node back up through its parents to the root. The algorithm runs either indefinitely or for a predefined number of iterations.
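The four steps above can be sketched end-to-end in a few dozen lines. The following is a minimal, generic Python outline (not the article's C# project); the `Node` class and the state interface (`actions()`, `apply()`, `is_terminal()`, `reward()`) are assumptions for illustration:

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # assumed to expose actions(), apply(), is_terminal(), reward()
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def ucb1(node, c=math.sqrt(2)):
    if node.visits == 0:
        return float("inf")     # unvisited children are always tried first
    exploit = node.value / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore

def mcts(root, iterations):
    for _ in range(iterations):
        # 1. Tree traversal: descend by maximum UCB1 until a leaf is reached
        node = root
        while node.children:
            node = max(node.children, key=ucb1)
        # 2. Expansion: a visited, non-terminal leaf gets one child per action
        if node.visits > 0 and not node.state.is_terminal():
            node.children = [Node(node.state.apply(a), node)
                             for a in node.state.actions()]
            node = node.children[0]
        # 3. Rollout: play random actions until a terminal state
        state = node.state
        while not state.is_terminal():
            state = state.apply(random.choice(state.actions()))
        reward = state.reward()
        # 4. Backpropagation: push the result and the visit back to the root
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # The most-visited child of the root is the recommended action
    return max(root.children, key=lambda n: n.visits)
```

After enough iterations, the child of the root with the highest visit count corresponds to the most promising first action.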

Using the Code

The full source code is available on GitHub.

In the example project, we traverse a tree of actions for N iterations to find the best action for a player object.

Create the nodes and the available actions for each node:

var s1 = new Node("nod1");
s1.AddAction(new NodeAction(1, true)); 
var s2 = new Node("nod2");
s2.AddAction(new NodeAction(1, true));
s2.AddAction(new NodeAction(10, false)); 
var s3 = new Node("nod3");
s3.AddAction(new NodeAction(10, true));
s3.AddAction(new NodeAction(20, false)); 
var s4 = new Node("nod4");
s4.AddAction(new NodeAction(10, true));
s4.AddAction(new NodeAction(2, false)); 
var s5 = new Node("nod5");
s5.AddAction(new NodeAction(1, true));
s5.AddAction(new NodeAction(2, false)); 
var s6 = new Node("nod6");
s6.AddAction(new NodeAction(10, true)); 

var s7 = new Node("nod7");
s7.AddAction(new NodeAction(30, true));   // Winning actions (true) 

var s8 = new Node("Node8");
s8.AddAction(new NodeAction(1, true)); 
var s9 = new Node("Node9");
s9.AddAction(new NodeAction(1, true));   


// Run the simulation while the elapsed time is under 2000 (presumably milliseconds)
var simulation = new Simulation((int visit, long time) => time < 2000);
simulation.Simulate(s1, new Player());

The simulation prints a dump of the tree, where each line shows the node name, its visit count, and its score:


-nod1=>75244 - 0
|  -nod2=>36947 - 2.02481632223913
|  |  -nod4=>18473 - 2.03417711163837
|  |  -nod5=>18473 - 2.03368991410683
|  -nod3=>38296 - 2.02481639471569   <--- winning node
|     -nod6=>38293 - 2.02316377512226
|     -nod7=>2 - 18.7485536367855    <--- winning action
|        -Node8=>1 - 2.17741002251547
|           -Node9=>0 - 1.79769313486232E+308
|              -Node10=>0 - 0

Once we have the simulation results, we could, for example, apply the A* algorithm to retrieve the best action path.
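For instance, a simple greedy walk that always follows the highest-scoring child recovers the marked winning path nod1 → nod3 → nod7 from the dump above. A minimal Python sketch, where the `TreeNode` class and its `name`, `score`, and `children` fields are hypothetical stand-ins for the project's node type:

```python
class TreeNode:
    """Hypothetical minimal node shape matching the tree dump above."""
    def __init__(self, name, score, children=()):
        self.name = name
        self.score = score
        self.children = list(children)

def best_path(root):
    """Greedily descend the tree, always taking the child with the highest score."""
    path = [root.name]
    node = root
    while node.children:
        node = max(node.children, key=lambda n: n.score)
        path.append(node.name)
    return path
```

Applied to the dump above, this walks nod1 → nod3 (2.0248164 beats nod2's 2.0248163) → nod7 (18.75 beats nod6's 2.02).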

Points of Interest

You might also be interested in the following video: Monte Carlo Tree Search by John Levine


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Denis Pashkov
United States United States


Article Copyright 2017 by Denis Pashkov