Click here to Skip to main content
15,894,955 members
Please Sign up or sign in to vote.
2.50/5 (2 votes)
See more:
Hi, I am getting the following string data from the database and they are label structure of Gmail Mailbox. They are paired with "/" operator. The string comes before "/" is parent and which comes after the "/" is a child.

parent/child
parent/parent2/child
parent/parent2/parent3/child
parent/parent2/parent3/parent4/child
parent/parent2/parent3/parent4/parent5/child..
...................................................

This above data is getting extracted dynamically and the names of "parent/child" could be anything, and they can have any number of pairs.

how to parse all the possible lines and build a tree structure ?

C#
foreach(string str in data.strs)
{

}
}
Posted
Comments
BillWoodruff 27-Feb-15 7:21am    
Hi, I will respond to this question in the next 18 hours.

Do you have any experience with writing recursive parsers ?
JBobby 27-Feb-15 7:56am    
Ok Bill. No, I don't have any, unfortunately.

One of the simplest way to define a tree structure is using already available list implementation, for example:
C#
    using NodeList = System.Collections.Generic.List<treenode>;

// ...

    class TreeNode {
        internal string Name { get; set; }
        internal TreeNode[] Children { get { return children; } } 
        NodeList children = new NodeList();
        internal void AddChild(string name) { /* isn't it obvious? */ }
        // ... and so on...
    } //class TreeNode</treenode>


During parsing, you need to walk the tree and find matching path, adding new children as nodes which haven't been added before.

Anyway, this is a really bad representation of tree in your input data. Are you sure it's given and you cannot change it into anything better? Did you get an idea on tree operations?

The property Name is shown just for example. Another step would be using a generic parameter:
C#
    using System.Collections.Generic;

//...

    class TreeNode<NodeInfo> {
        internal NodeInfo Info { get; set; }
        internal TreeNode<NodeInfo>[] Children { get { return children; } }
        List<TreeNode<NodeInfo>> children = new List<TreeNode<NodeInfo>>();
        internal void AddChild(NodeInfo name) { /* isn't it apparent */ }
        // ... and so on...
    } //class TreeNode


You may need to add tree operations to the node class, such as insertion, removals, search, and so on…

—SA
 
Share this answer
 
v2
Comments
JBobby 27-Feb-15 5:11am    
Yes, it's the same data representation I am retrieving from Google and According to my requirements, I need to know which is a parent and which is a child.
Sergey Alexandrovich Kryukov 27-Feb-15 8:46am    
I did not retrieve anything from Google...
What is still unclear? If you need some algorithm you cannot figure out, please explain what exactly is should do. Any node can be made parent or child or both. And you can add Parent property to this TreeNode class, to be able to move up the tree...
—SA
JBobby 27-Feb-15 9:34am    
No, I meant I retrieved the data from Google Gmail using its api. I was talking about the data representation I posted on my question. They are Gmail Custom Label Structure. Parent/Parent2/Parent3/child. Where Parent is the parent of Parent2, Parent2 is the parent of Parent3 and Parent3 is the parent of child. I need an algorithm to put this data into a tree structure so that I can figure out the relationship between the pair(parent/child). I am completely new to Tree Structure and C# collections so anything might be useful for me. Your class looks pretty good.
Sergey Alexandrovich Kryukov 27-Feb-15 9:58am    
I explained how to do it. I do understand the input data structure. You initial explanation was clear, thank you. I really appreciate it, as many other inquirers don't provide clear specifications.

Please, what are you missing now?
You parse the .../.../... chain in array using string.Split. Is it clear?
Then, in the resulting array, each element's index is the tree depth of some node, and the value is the node name (text). Is that clear?
Each input data line is the instance of the array. You add data line by line, each time create a new array. Is that clear?
On each input data line, you traverse the tree starting from the top element. First, you match name of the current node with the array element of the same depth. If this is the same, you match or add new element in children. Is that clear?
To traverse children you just use foreach(TreeNode child in Children) or you can use Array.IndexOf, to immediately get to the index of found (matching) child. Is that clear? If it returns -1, it means the matching child node at this level is not found, so you will add a node at this level. And so on, in a loop. Is that clear?

The performance will be far from optimal, but this is rather the consequence of ugly input data format, its huge redundancy (the same nodes are written again and again)...

—SA
JBobby 27-Feb-15 11:12am    
+5 for good explanation. :) thank you very much. :)
Keep in mind it only properly works if you're using an structure of levels that increases by one.


string[] data = new string[]
{
"level1/l1-child1",
"level1/l1-child2",
"level1/level2/l2-child1",
"level1/level2/l2-child2",
"level1/level2/level3/l3-child1",
"level1/level2/level3/l3-child2",
};

C#
TreeNode currentNode = null, nextNode = null;
int items_level = 0;
foreach(string dat in data)
{
    string[] items = dat.Split(new char[] { '/' });

    // same level as last string
    //
    if (items_level == items.Length)
    {
        currentNode.Nodes.Add(items[items_level - 1]);
        continue;
    }

    // special case for 1st time
    //
    if (currentNode == null)
    {
        nextNode = treeView1.Nodes.Add(items[0]);
        items_level = 1;
    }
    else
    {
        nextNode = currentNode.Nodes.Add(items[items_level - 1]);
    }

    nextNode.Nodes.Add(items[items_level]);

    items_level = items.Length;
    currentNode = nextNode;
}
 
Share this answer
 
Comments
JBobby 27-Feb-15 11:11am    
+5 for your answer. :) thank you very much.
To make this more interesting I'll use data that has one item that is not "in order:"
C#
private string data = @"parent1/child1
parent1/parent1a/child2
parent2/parent2a/parent2a1/parent2a1a/child4
parent1/parent1a/parent1a1/child3
parent2/parent2a/parent2a1/parent2a1a/child4
parent2/parent2a/parent2a1/parent2a1a/parent2a1a1/child5";
Note the assumption made here that the list of Google e-mail category folder-paths will never have two different nodes (within two different root nodes) that each have a sub-node with the same name; it is this constraint that allows non-recursive parsing by using a Dicationary<string, TreeNode>. And, please note: this code was written a few years ago; I welcome any suggestions for improvement.

In the example shown here the data is parsed to populate a WinForm TreeView:
C#
using System;
using System.Collections.Generic;
using System.Windows.Forms;

private Dictionary<string,> dctTokenToTreeNode = new Dictionary<string,>(); 
private TreeNode currentNode;
private TreeNode lastNode;

private string[] parsedLines;
private string[] tokens;
private string lastToken;

private char[] lineSplitChar = new[] { '\n', '\r' };
private char[] nodeSplitChar = new[] { '/' };

private void ParseData(TreeView TV, string data)
{
    string token;

    parsedLines = data.Split(lineSplitChar, StringSplitOptions.RemoveEmptyEntries);

    foreach (string line in parsedLines)
    {
        tokens = line.Split(nodeSplitChar, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < tokens.Length; i++)
        {
            lastToken = token;

            token = tokens[i];

            if (dctTokenToTreeNode.ContainsKey(token)) continue;
            
            currentNode = new TreeNode(token);

            if (dctTokenToTreeNode.TryGetValue(lastToken, out lastNode))
            {
                lastNode.Nodes.Add(currentNode); 
            }
            else
            {
                TV.Nodes.Add(currentNode);
            }

            dctTokenToTreeNode.Add(token, currentNode);
        }
    }
}
 
Share this answer
 
v2

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900