Click here to Skip to main content
15,125,746 members
Articles / General Programming / Algorithms
Article
Posted 2 Jan 2011

Stats

28.3K views
288 downloads
15 bookmarked

Solving LCA by Reducing to RMQ in C#

Rate me:
Please Sign up or sign in to vote.
4.70/5 (12 votes)
3 Jan 2011CPOL3 min read
Finds the least common ancestor in a tree

Introduction

I spent a bit of time yesterday looking for an algorithm to find the least common ancestor in a given tree. The LCA is the first node shared by two child nodes.

lca1.png

In the picture above, the least common ancestor for 3 and 6 is 1. The node 0 is also a common ancestor, but not the least common.

Least common ancestor has many applications for social networks, computer networks, common subexpression elimination in compilers, etc. My requirement is to find the common ancestor in an expression tree. I could not find anything on the internet in C#. I decided to write one based off of work by Robert Tarjan, Omer Berkman, and Uzi Vishkin.

A node in my graph has a value and a set of child nodes. We do not have a reference to the parent node. The only way to "identify" a node in this graph is by reference.

C#
/// <summary>
/// A tree node.
/// </summary>
/// <typeparam name="T">The type of the value associated with this node.</typeparam>
public interface ITreeNode<T>
{
    /// <summary>
    /// Gets or sets the value.
    /// </summary>
    /// <value>The value.</value>
    T Value { get; set; }

    /// <summary>
    /// Gets the children.
    /// </summary>
    /// <value>The children.</value>
    IEnumerable<ITreeNode<T>> Children { get; }
}
A node in the example tree.

Using the Code

We will create a helper class that has a single method. This method takes two nodes and returns their most common parent.

C#
IGraphNode<T> FindCommonParent(ITreeNode<T> x, ITreeNode<T> y) 

The algorithm by Berkman and Vishkin is very straight forward. We do a bit of preprocessing first. We visit each node using Eulerian path and store this in an array. Eulerian path can be found using depth first transversal. We start at the root and record the path from the root to the first child. If that child also has a child node, then we record the path to that child all the way until we get to the bottom of the graph. Once we get to the bottom, we work our way up recording the path from the child back to the parent. If the parent has another child, we perform the same process until we have visited every node in the graph and made our way back up to the root.

lca4.png

The picture above shows the Euler path and the resulting array.
C#
private void PreProcess()
{
    // Eulerian path visit of graph 
    Stack<ProcessingState> lastNodeStack = new Stack<ProcessingState>();
    ProcessingState current = new ProcessingState(_rootNode);
    ITreeNode<T> next;
    lastNodeStack.Push(current);

    NodeIndex nodeIndex;
    int valueIndex;
    while (lastNodeStack.Count != 0)
    {
        current = lastNodeStack.Pop();
        if (!_indexLookup.TryGetValue(current.Value, out nodeIndex))
        {
            valueIndex = _nodes.Count;
            _nodes.Add(current.Value);
            _indexLookup[current.Value] = new NodeIndex(_values.Count, valueIndex);
        }
        else
        {
            valueIndex = nodeIndex.LookupIndex;
        }
        _values.Add(valueIndex);

        // If there is a next then push the current value on to the stack along with 
        // the current value.
        if (current.Next(out next))
        {
            lastNodeStack.Push(current);
            lastNodeStack.Push(new ProcessingState(next));
        }
    }
    _nodes.TrimExcess();
    _values.TrimExcess();
}
The code used to create the Euler path array.

After the preprocessing is finished, we can start calculating the LCA by using Range Minimum Query. Range Minimum Query works exactly as it sounds. For a given range in an array, we need to find the minimum value. Our range is defined by the first time a node is visited. We do a hash table lookup to get this information for the given nodes followed by a simple FOR loop to find the minimum value in the given range. Not including the hashtable lookups, we can say the worst case time is O(n) where n is the number of nodes in the graph.

lca3.png

In the given range, 1 is the minimum value and the least common ancestor. The range is defined by where the two nodes are first encountered.
C#
public ITreeNode<T> FindCommonParent(ITreeNode<T> x, ITreeNode<T> y)
        {
            // Find the first time the nodes were visited during preprocessing.
            NodeIndex nodeIndex;
            int indexX, indexY;
            if (!_indexLookup.TryGetValue(x, out nodeIndex))
            {
                throw new ArgumentException("The x node was not found in the graph.");
            }
            indexX = nodeIndex.FirstVisit;
            if (!_indexLookup.TryGetValue(y, out nodeIndex))
            {
                throw new ArgumentException("The y node was not found in the graph.");
            }
            indexY = nodeIndex.FirstVisit;

            // Adjust so X is less than Y
            int temp;
            if (indexY < indexX)
            {
                temp = indexX;
                indexX = indexY;
                indexY = temp;
            }

            // Find the lowest value.
            temp = int.MaxValue;
            for (int i = indexX; i < indexY; i++)
            {
                if (_values[i] < temp)
                {
                    temp = _values[i];
                }
            }
            return _nodes[temp];
        } 

Points of Interest

There is one small issue with this solution. If we are calculating the LCA for nodes X, Y and Y is descendant of X, then this solution will return X. For my requirement, this is not an issue but in some cases may want to return the immediate parent of X instead.

This was my first crack at this and I am not an algorithm expert. There are plenty of more optimal solutions for this problem. This solution has very little code and very simple to understand. Please let me know if you have any feedback.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Ziad Elmalki
United States United States
No Biography provided

Comments and Discussions

 
QuestionSimpler solutions Pin
bwulfe21-Mar-12 6:58
Memberbwulfe21-Mar-12 6:58 
GeneralMy vote of 5 Pin
Josheco7-Jan-11 8:48
MemberJosheco7-Jan-11 8:48 
GeneralMy vote of 4 Pin
Member 5279784-Jan-11 12:22
MemberMember 5279784-Jan-11 12:22 
GeneralMy vote of 5 Pin
felmalki4-Jan-11 12:06
Memberfelmalki4-Jan-11 12:06 
GeneralMy vote of 5 Pin
imgen2-Jan-11 23:48
Memberimgen2-Jan-11 23:48 
GeneralRe: My vote of 5 Pin
Andrew Rissing10-Jan-11 6:00
MemberAndrew Rissing10-Jan-11 6:00 
I have to second imgen's comment as well.

In your algorithm, you are traversing the entire tree to perform the pre-processing. I can only assume you would do this once and store the pre-processing off until the tree is modified. Even then, it would seem more efficient and straight forward to just use the algorithm put forth by imgen, especially if you expect to only perform this search for two nodes at a time and your tree is not small.

As a second alternative, you may consider using a modified pre-order tree[^] to store your data. The algorithm presented there requires certain assumptions to hold true, but its a fairly handy algorithm to be aware of.
GeneralRe: My vote of 5 Pin
Ziad Elmalki10-Jan-11 16:03
MemberZiad Elmalki10-Jan-11 16:03 
GeneralRe: My vote of 5 Pin
Andrew Rissing11-Jan-11 5:11
MemberAndrew Rissing11-Jan-11 5:11 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.