Click here to Skip to main content
15,881,173 members
Articles / Programming Languages / C#

Introducing Nets

Rate me:
Please Sign up or sign in to vote.
3.76/5 (9 votes)
13 Sep 2007CPOL13 min read 57K   400   29   15
An introduction to the Nets data structure.

Screenshot - SimpleNetTestMain.png

Introduction

Think of a house, any house, and if you do, it will probably be a bit blank, a bit unfilled in. You may think of a shape, be it one story, two story, or more, but the details won't have been fleshed out. You may have pictured the windows, but not if they had curtains, or the colour or pattern of the curtains. This, in simple terms, is called modeling. We do it all the time we can imagine something in basic shapes and forms, and we know what it means. Sure, we can fill in the details later, but most of the time, we like to the think that the details are already there. That we know what they are, but we just didn't bother to expand on them because we know what a house is.

When we come to computers though, we see something different. Our whole perception of computers and the way they work is not based on models of the same variety. When it comes to computers, everything has to be fleshed out to the nth degree. They need to be exact, to know everything about everything; otherwise, they won't work, and the slightest not knowing of something is a bug, a fault in the program that needs correcting.

Sure, we have arrays, which are models of a sort, and hashtables and dictionaries which, if we are lucky, may allow us to link up two related pieces of data together. Each piece of data is still unique. Even when we look at relational databases, the data is still separated into chunks, with each chunk, say, a company, being linked to other data, on companies, by what it is the company does. While the data itself is possibly very complex, the data type, i.e., company, is still a relatively simple data type in that it is a single block of data that represents a single item.

Of course, for demonstration purposes, I over generalise and simplify, but the reason for this is to show the thinking behind the creation of Nets, which attempts to deal with complex data types. Or to be more precise, strings of data types, and of course, the perfect example for dealing with strings of data types is to deal with strings themselves, which is why the example provided shows how Nets deal with words and then sentences and then paragraphs. Of course, as with any data structure, it is up to the individual whether they have any use for it, and I wouldn't have thought of it unless I had a specific use in mind.

Anatomy of a Net

A Net gets its name from the diamond pattern style of fisherman's net, although this is not entirely accurate as we don't complete the diamond shape. A basic Net node looks like this:

Screenshot - BasicNetObject.png

It largely is an expansion on the tree data structure, with a pointer to the forward node, and nodes to the left and right that are based on object comparisons, with the node forward of the current node to the left or up in the picture, being less than the data in the forward node itself, and the data in the right nodes being greater than the data within the forward node. Of course, things are made more complicated by the fact that the left and right nodes are actually lists of nodes. So, without getting into code just yet, I'll try to explain the way it works.

If you have three strings "ABC", "ABB", and "ABF", and you add them to the Net, you will get a Net that looks like:

Screenshot - SimpleNetLetters.png

If you add them in the order listed above and if you were to add the "ABF" string first, then "C" and "B" would both be to the left or up on the picture. It should be added here that if you were to add the string "AB" at this point, it would disappear as the simple Net does not allow duplicate strings. The "AB" string would simply merge in with the other strings and be forgotten.

However, if you add two more strings "ABBE" and "ABAF", the Net will look like this:

Screenshot - SimpleNetAddedLetters.png

At this point, the original string "ABB" has been completely overwritten, and will no longer be returned when you print the contents of the Net.

As can be seen, the lists to the left and right (above and below in the image) are sorted automatically as the Net is built. This is now an integral part of how the Nets work (see the code discussion), but it is clear that the Nets would work faster without this, although the primary goal in the development of this project has been to get the Nets data structure working correctly rather than getting it to work fast.

Using the Demo

The best way to show exactly how the Net works is to run through the demonstration project and to use it with the aim of not only seeing how it works but understanding what can be achieved through using the Nets data structure.

The initial and main demonstration deals with text files. This is pretty rough and ready code, and the output depends pretty much entirely on the editing of the file. I mention this because included in the project is a text copy of Charles Darwin's Origin of the Species, which is used throughout the main part of the demonstration, and which does contain a few editing errors.

The Book Demo

The tab for the book demo looks like:

Screenshot - SimpleNetBooksDemoTab.png

There are three types to this demo: the words, sentences, and the paragraphs demo. As the names suggest, the idea is that you select a book and the Net is built up from either the words, the sentences, or the paragraphs contained in the book. It should be noted that the more complex the Net, the longer it will take the computer to build it, so the words demo will build relatively fast, while the sentences demo will take a while, and if you are going to run the paragraphs demo, then bring a book or don't surf the news pages until after you've started it.

As it is the quickest, the words demo is always set as the default on start up. The bottom half of the tab deals with the search options that we shall look at shortly. First, start up the demo by selecting the File button at the top:

Screenshot - SimpleNetFileLoad.png

and selecting a text file. The program will then read the file and build the Net. The output will look like:

Screenshot - SimpleNetWordsDemoRunOne.png

You'll have to scroll down for the more meaningful items, but as you can see:

Screenshot - SimpleNetWordsRunEnd.png

6512 items to choose from.

Searching

There are two types of search that can be run on the Net, and these are a search on data strings within the Net that start with a given string, which is on the left, and a search for a data string that contains the given items.

The search starting with is on the left of the tab. If we enter the letter o in the text box and press the left most search button:

Screenshot - SimpleNetSearchSO.png

we get all the individual words in the Net, starting with the letter o. There are 161. If we move to the right side of the tab and enter oo into the text box there, we can run the search for all the words containing the letters o and o:

Screenshot - SimpleNetSearchCOO.png

Here, we get the 482 words in the book that contain the values o and o. Note here that the default behaviour is to return any item that contains the values o and o, and if you want to run a search for all the items that contain the values o and o together, you need to specify this by clicking the sequential checkbox. There are 81.

The sentences and the paragraphs demos work in exactly the same way; they are just dealing with much more complicated data.

The Data Demo

The data demo is much simpler in construction in that it is simply provided to show the Nets working using a custom or more advanced data type:

Screenshot - SimpleNetTestDataDemoTab.png

The Insert data button builds a Net of random data using a class that has an integer and a string as members. These are then created in random length strings of data and inserted into the Net. You can click on the Insert Data button as many times as you feel like. The search options allow for the entering of a single integer value and text to go with it. The valid range is from 0 - 9 for the integer, and one of "up", "down", "left", "right", "forward", "backward" for the text. Note that they are all lower case, and that they don't have any intrinsic meaning within the program, they are simply strings.

If you click the Insert Data button a few times, you will get something that looks like this:

Screenshot - SimpleNetMovementInsert.png

And a search through the generated data for the value 5 and the text "forward" will give something like:

Search For String Containing 
5 forward
Print out for item 0
Value = 3 String = forward Value = 7 String = left Value = 4 String = right
    Value = 3 String = down Value = 8 String = down Value = 0 String = down 
Value = 5 String = forward Value = 5 String = up Value = 6 String = 
down Value = 3 String = backward Value = 1 String = down Value = 3 
String = right Value = 4 String = down Value = 6 String = right Value = 1 
String = up Value = 5 String = forward Value = 3 String = forward Value = 0 
String = down Value = 8 String = left 
Print out for item 1
Value = 6 String = backward Value = 4 String = down Value = 5 
String = forward 
Print out for item 2
Value = 6 String = up Value = 8 String = forward Value = 3 
String = right Value = 0 String = forward Value = 5 String = right Value = 5 
String = forward Value = 3 String = forward 
Print out for item 3
Value = 7 String = backward Value = 6 String = backward Value = 3 
String = forward Value = 4 String = right Value = 8 
    String = forward Value = 8
String = right Value = 4 String = backward Value = 3 String = down Value = 5 
String = forward Value = 4 String = left Value = 5 String = left Value = 5 
String = backward Value = 6 String = left Value = 0 String = backward 
    Value = 2 String = forward Value = 0 String = backward 
Print out for item 4
Value = 8 String = right Value = 6 String = down Value = 5 
String = left Value = 2 String = right Value = 9 String = forward Value = 3 
String = down Value = 6 String = forward Value = 8 String = up Value = 6 
String = down Value = 4 String = right Value = 3 String = down Value = 5 
String = forward

Note that the search mechanism here is for all the strings containing the value entered; so in this example, we get no values that start with the entered data.

Using the Net

The SimpleNet class is defined as:

C#
public class SimpleNet< T > where T : IComparer< T >

which is a simple generic class definition that uses a class that implements the IComparer interface. This class can be made up of any data that is required by the user, with the only requirement being that it must implement the IComparer Compare method. This is the sole requirement to using the SimpleNet class. So the class we use for the book example is defined as:

C#
public class StringData : IComparer< StringData >
{
    private string strData;

    public string Data
    {
        get
        {
            return strData;
        }
        set
        {
            strData = value;
        }
    }
          
     public StringData( string data )
    {
        strData = data;
    }

    public StringData()
    {
        strData = null;
    }
   
    public int Compare( StringData dataOne, StringData dataTwo )
    {
        return String.Compare( dataOne.Data.ToLower(), 
            dataTwo.Data.ToLower() );
    }

    public override string ToString()
    {
        return strData;
    }
}

As you can see, there is nothing special about this class at all, except the Compare function which calls the string Compare function, which probably calls the IComparable CompareTo function. This then is the technical complexity needed to use classes in SimpleNet. A Compare function that calculates if the data in the objects is equal, returning a zero if they are, or a -1 if the first object passed is smaller than the second object, and a 1 if the first object is greater than the second object.

As we are dealing with complex data strings or collections, the data must be entered into the Net as lists of data, and this is done in the mainform.cs file when we create the strings of data for the data demo with the code:

C#
List< DemoData > newList = new List< DemoData >();

if( demoNet == null )
demoNet = new SimpleNet< DemoData >();

int nNumberOfItems = rand.Random( 20 );
int nPiecesOfData = 0;

List< string > stringCol = new List< string >();
stringCol.Add( "up" );
stringCol.Add( "down" );
stringCol.Add( "left" );
stringCol.Add( "right" );
stringCol.Add( "forward" );
stringCol.Add( "backward" );

for( int i=0; i<nNumberOfItems; i++ )
{
    newList.Clear();
    nPiecesOfData = rand.Random( 20 );
    for( int j=0; j<nPiecesOfData; j++ )
    {
        newList.Add( new DemoData( rand.Random( 10 ), 
            stringCol[ rand.Random( stringCol.Count ) ] ) );
  }

  demoNet.InsertList( newList );
}

I use this example as it is very simple in that all we do is create a list, define a few variables, and then build it before adding to the Net with the InsertList function. It is also possible to use the in-built facilities of the class which contains a list that can hold the data before inserting it into the Net. This is done through the AddToList( T data ) function and then calling the InsertList() function, though you must remember, you should use this technique to call ClearList() before starting a new list.

To perform a search, you use the code:

C#
List< StringData > list = new List< StringData >();

for( int i=0; i<strWorking.Length; i++ )
{
    list.Add( new StringData( strWorking[ i ].ToString() ) );
}

List< List< StringData > > resultsList = 
    wordNet.FindListsStartingWith( list );

which builds a list to search for and then calls the search function. In this example, I am using the code to search for a word at the start of a string from the wordNet example in mainform.cs.

The SimpleNet Class

The SimpleNet class is 1400 lines long, so I will give more of an overview of its basic operations here and let those who are interested to go through the code. The class itself consists of 14 public functions; of these, 4 functions are for inserting data, these being the already mentioned AddToList, ClearList, and the overloaded InsertList. Two functions return lists of lists of data in the Net, these being the NetPrintOut function which returns the data as it gets it, and the GetSortedList which returns the same data but sorts it first. There are then two IsListPresent functions which return a boolean value for the given list. There are two overloads for FindListsStartingWith and five overloads for FindListsContaining.

To start with, all data in the SimpleNet is stored and manipulated as part of the SimpleNetNode class. This is your typical data type node that contains the lists and the points to the node in front and the node behind or the parent node. It is defined as:

C#
public class SimpleNetNode< T > where T : IComparer< T >

and contains the data:

C#
private SimpleNetNode< T > parentNode;
private SimpleNetNode< T > forwardNode;
private List< SimpleNetNode< T > > leftList;
private List< SimpleNetNode< T > > rightList;
private T data;
private bool bIsBase;
private bool bIsRightBase;
private bool bIsLeftBase;

As you can see, the first four variables point to either other nodes or lists of other nodes. The data variable is the class that is added to the node, and the Base variables are used as quick checks to see if the lists are used.

There is no reason for a user of the Net class to ever create a Net node object. In fact, this was originally private, but as we are using Generics, the compiler was never quite sure if there was an error or not. So, it was simpler and cleaner to have it as a separate class.

In the SimpleNet class, there are two conversion functions:

C#
private List< SimpleNetNode< T > > ConvertListToNodes( List< T > dataList )

which convert the input lists to a list of SimpleNetNodes and the functions:

C#
private List< T > ConvertList( List< SimpleNetNode< T > > list )

and

C#
private List< List< T > > ConvertList( 
    List< List< SimpleNetNode< T > > > list )

which convert the lists back before they are returned to the calling code.

Inserting Data

As you have seen, the lists of data are inserted using the InsertList function which is responsible for building the Net. This function starts by taking the list as a parameter:

C#
private void InsertList( List< SimpleNetNode< T > > insertionList )

and then cycles through the list, dealing with one item at a time:

C#
for( int i=0; i<insertionList.Count; i++ )
{
    SimpleNetNode< T > nnNewData = 
        new SimpleNetNode< T >( insertionList[ i ].Data );
}

If the list is empty, the data is made to be the root:

C#
if( TrueRoot.ForwardNode == null )
{
    TrueRoot.ForwardNode = nnNewData;
    CurrentNode = TrueRoot.ForwardNode;
    continue;
}

If there is already data in the list, we check the next item which is the forward of the current node. And if this is null, we add the data at this point. If at this point the CurrentNode forward is not null, we have to do a comparison and decide where to place the item, so the main block of the InsertList code looks like:

C#
nnTempNode = CurrentNode;
int nResult = nnNewData.Data.Compare( nnNewData.Data, 
    CurrentNode.ForwardNode.Data );
if( nResult == 0 )
{
    CurrentNode = CurrentNode.ForwardNode;
}
else if( nResult == -1 )
{
}
else
{
}

This is where data is overwritten, because if the data we are adding is the same as the data in the list, we do nothing with it, but simply move forward and go on to the next item in the list.

If the data is not equal, whether it be larger or greater, then we must continue checking it, so for demonstration purposes, we'll assume that the Compare function returned -1.

C#
bool bFound = false;
for( int n=0; n<CurrentNode.LeftListCount(); n++ )
{
    nnTempNode = CurrentNode.GetLeftNodeAt( n );

    if( nnTempNode.Data.Compare( nnTempNode.Data, nnNewData.Data ) == 0 )
    {
        CurrentNode = nnTempNode;
        bFound = true;
        break;
    }
}

if( bFound == false )
{
    SimpleNetNode< T > insertionNode = 
        new SimpleNetNode< T >( nnNewData.Data );
    insertionNode.ParentNode = CurrentNode;
    CurrentNode.AddLeftNode( insertionNode );
    nnSortNode = CurrentNode;
    CurrentNode.IsBase = true;
    CurrentNode.IsLeftBase = true;
    CurrentNode = insertionNode;
    bSortLeftList = true;
}

We cycle through all the elements in the chosen list, and if it is present, set the current pointer to the node in the list and continue from there and we move to the next item in the insert list. If the node is not present in the list, we add it to the list.

Searching

The main searching functions in the SimpleNet class are:

C#
public List< List< T > > FindListsStartingWith( List< T > dataList )

and

C#
public List< List< T > > FindListsStartingWith()

These two functions, as with their companion FindListsContaining functions, search through the SimpleNet and return the results as a list of lists for the data. This is done by converting the Net into a list of lists internally and then searching through the lists. There are other ways to do this, but with a data structure as complicated as a Net, it is easy to tie yourself into knots when trying to move through searching for specific items, so it was felt that this is a solution viable for a simple iteration of the Nets data structure.

Conclusion

This article is an introduction to the ideas and the mechanisms of the Net data structure, and as such, with the exception of the insertion code, the techniques, especially for the search, are of a more simplistic nature than they are in a full Nets structure. Despite this, it is hoped that this introduction has given an idea of the power that Nets can bring to a project.

References

  • Tod Golding (2005) Professional .NET 2.0 Generics
  • Nicholai M. Josuttis (1999) The C++ Standard Library
  • Tom Archer (2001) Inside C#, Microsoft Press
  • Jeffrey Richter (2002) Applied Microsoft .NET Framework Programming, Microsoft Press
  • Charles Peltzold (2002) Programming Microsoft Windows With C#, Microsoft Press
  • Robinson et al (2001) Professional C#, Wrox

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United Kingdom United Kingdom
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
QuestionIs it me or do you need to re-format your article? Pin
Keith Vinson17-Sep-07 11:41
Keith Vinson17-Sep-07 11:41 
AnswerRe: Is it me or do you need to re-format your article? Pin
Jecc17-Sep-07 23:57
Jecc17-Sep-07 23:57 
AnswerRe: Is it me or do you need to re-format your article? Pin
pseudonym6718-Sep-07 0:32
pseudonym6718-Sep-07 0:32 
GeneralRe: Is it me or do you need to re-format your article? Pin
Keith Vinson18-Sep-07 4:06
Keith Vinson18-Sep-07 4:06 
GeneralRe: Is it me or do you need to re-format your article? Pin
pseudonym6718-Sep-07 6:43
pseudonym6718-Sep-07 6:43 
GeneralRe: Is it me or do you need to re-format your article? Pin
Keith Vinson18-Sep-07 10:12
Keith Vinson18-Sep-07 10:12 
GeneralRe: Is it me or do you need to re-format your article? Pin
pseudonym6718-Sep-07 23:21
pseudonym6718-Sep-07 23:21 
GeneralReminds me of Pile Pin
jongalloway16-Sep-07 10:49
jongalloway16-Sep-07 10:49 
GeneralRe: Reminds me of Pile Pin
pseudonym6717-Sep-07 0:45
pseudonym6717-Sep-07 0:45 
GeneralRe: Reminds me of Pile Pin
GerhardKreuzer18-Sep-07 8:42
GerhardKreuzer18-Sep-07 8:42 
GeneralRe: Reminds me of Pile Pin
pseudonym6718-Sep-07 9:05
pseudonym6718-Sep-07 9:05 
GeneralLittle explanation of probable usage Pin
rajantawate1(http//www.tawateventures.com13-Sep-07 10:59
rajantawate1(http//www.tawateventures.com13-Sep-07 10:59 
GeneralRe: Little explanation of probable usage Pin
pseudonym6713-Sep-07 13:04
pseudonym6713-Sep-07 13:04 
GeneralTrie Pin
Daniel Grunwald13-Sep-07 9:57
Daniel Grunwald13-Sep-07 9:57 
GeneralRe: Trie Pin
pseudonym6713-Sep-07 10:31
pseudonym6713-Sep-07 10:31 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.