Traversal State Pattern





5.00/5 (1 vote)
A Pattern for Identifying Processed Entities During Iteration Without Additional Collections
Introduction
The TraversalStatePattern
can be used to identify processed entities without requiring additional collections and to reset the IsProcessed
flag after entities have been marked as processed during a full or partial traversal, eliminating the need to iterate through all nodes once again to clear the flag. It can be used in graph traversal or for processing other types of entities.
Background
The idea is to include an integer state value within each Node
, along with a global integer state used for comparison with the Node's state to determine whether the node has been processed. The Node
's state value will be set to the global value to mark it as processed. Before each traversal, we increment the global state number, which effectively resets all nodes to an unprocessed state prior to the traversal.
Example Situation
Example of a graph node to which we want to apply a pattern:
/// <summary>
/// This class contains all logic and data related to its basic functionality
/// </summary>
public partial class ExampleNode
{
public readonly int Value;
#region Graph hierarchy
private readonly List<ExampleNode> _children = new();
public IReadOnlyList<ExampleNode> Children => _children;
public void AddChild(ExampleNode child)
{
_children.Add(child);
}
#endregion
}
Pattern Code
By using a partial class
, we can relocate all pattern logic to a separate file:
/// <summary>
/// This class contains all pattern members necessary to implement TraversalStatePattern
/// </summary>
public partial class ExampleNode
{
private static int _globalTraversalState;
private int _nodeTraversalState = -1;//set to -1 to not be IsProcessed() by default
public static void NewTraversal()
{
_globalTraversalState++;
}
public void MarkProcessed()
{
_nodeTraversalState = _globalTraversalState;
}
public bool IsProcessed() => _nodeTraversalState == _globalTraversalState;
}
We can combine the MarkProcessed()
and IsProcessed()
methods into one. It will only return 'True
' the first time you use it:
public bool CheckSetProcessed()
{
if (_nodeTraversalState == _globalTraversalState) //check is processed
return true;
_nodeTraversalState = _globalTraversalState;
return false;
}
Pattern Usage
Example graph traversal method:
public void GraphTraversal(ExampleNode root, Action<ExampleNode> procedure)
{
//Start a new traversal. Now method CheckSetProcessed() in all nodes
//will return False after this
ExampleNode.NewTraversal();
var queue = new Queue<ExampleNode>();//Using Queue for BFS
//(or Stack can be used for DFS traversal)
//process start node
queue.Enqueue(root);
root.CheckSetProcessed();
while (queue.TryDequeue(out var node))
{
foreach (var resultChild in node.Children)
{
//Use status pattern
if (resultChild.CheckSetProcessed())
{
continue;//skip node that has been already processed
}
//Do some work
procedure(resultChild);
//process node children
queue.Enqueue(resultChild);
}
}
}
Multithreading Usage
For multithreading processing, we should modify the code a bit:
public partial class ExampleNode
{
private static int _globalTraversalState;
private int _nodeTraversalState = -1;
public static void NewTraversal()
{
_globalTraversalState++;
}
public bool CheckSetProcessed()
{
var newState = Interlocked.Exchange
(ref _nodeTraversalState, _globalTraversalState);
//if the original value was now equal to global, then node was not processed
return newState == _globalTraversalState;
}
}
Here's an example of how to use it in code that runs in multiple threads. I used BackgroundWorkers
, but you can use other methods like Tasks
:
public void GraphTraversal(
ExampleNode root,
Action<ExampleNode> procedure,
int threads,
int nodes)
{
ExampleNode.NewTraversalMultithreaded(); //Start a new traversal.
//Now all nodes IsProcessed() will give False
//Do multithreaded traversal
var waitEvent = new CountdownEvent(threads);
var processingBC = new BlockingCollection<ExampleNode> { root };
for (var i = 0; i < threads; i++)
{
var worker = new BackgroundWorker();
worker.DoWork += (_, _) =>
{
foreach (var exampleNode in processingBC.GetConsumingEnumerable())
{
procedure(exampleNode);//process node
if (Interlocked.Decrement(ref nodes) == 0)
processingBC.CompleteAdding();
foreach (var child in exampleNode.Children)
{
if (child.CheckSetProcessedMultithreaded())
{
continue;//skip node that has been already processed
}
processingBC.Add(child);
}
}
};
worker.RunWorkerCompleted += (_, _) =>
{
waitEvent.Signal();
worker.Dispose();
};
worker.RunWorkerAsync();
}
waitEvent.Wait();
}
Parallel Multithreaded Usage
(Under Parallel multithreaded means doing multiple concurrent iterations over the same structure, where each iteration runs using multithreading.)
Git repository has a parallel, multithreaded example of how to use it, but the benchmarks show it's only 5-10% faster than using ConcurrentDictionary
. ConcurrentDictionary way.
Parallel multithreaded way without additional collections, with bits manipulation and limitation of 64 parallel operations: ParallelStatePatternGraphTraversal.cs
NodeParallelUtils.cs (provides thread traversal context for tracking flag bits).
History
- 3rd December, 2023: Initial version