Click here to Skip to main content
6,596,602 members and growing! (18,565 online)
Email Password   helpLost your password?
General Programming » Algorithms & Recipes » General     Intermediate

Introducing Nets

By pseudonym67

An introduction to the Nets data structure
C# 2.0, Windows, .NET 2.0VS2005, Dev
Posted:13 Sep 2007
Views:14,474
Bookmarked:26 times
Unedited contribution
Announcements
Loading...
 
Search    
Advanced Search
Add to IE Search
printPrint   add Share
      Discuss Discuss   Broken Article?Report  
6 votes for this article.
Popularity: 2.82 Rating: 3.63 out of 5
1 vote, 16.7%
1
1 vote, 16.7%
2

3
2 votes, 33.3%
4
2 votes, 33.3%
5
Screenshot - SimpleNetTestMain.png

Introduction

Think of a house any house and if you do it will probably be a bit blank, a bit unfilled in. You may think of a shape be it one story, two story, or more but the details wont have been fleshed out, you may have pictured the windows but not if they had curtains or the colour or pattern of the curtains. This in simple terms is called modelling, we do it all the time we can imagine something in basic shapes and forms and we know what it means, sure we can fill in the details later, but most of the time we like to the think that the details are already there. That we know what they are but we just didn't bother to expand on them because we know what a house is.

When we come to computers though we see something different our whole perception of computers and the way they work is not based on models of the same variety. When it comes to computers everything has to be fleshed out to the nth degree. They need to be exact, to know everything about everything otherwise they wont work and the slightest not knowing of something is a bug, a fault in the program that needs correcting.

Sure we have arrays which are models of a sort and hashtables and dictionaries which if we are lucky may allow us to link up two related pieces of data together. Each piece of data is still unique. Even when we look at relational databases the data is still separated into chunks with each chunk say a company being linked to the other data on companies by what it is the company does. While the data itself is possibly very complex the data type i.e. a company is still a relatively simple data type in that it is a single block of data that represents a single item.

Of course for demonstration purposes I over generalise and simplify but the reason for this is to show the thinking behind the creation of Nets which attempt to deal with complex datatypes. Or to be more precise strings of data types and of course the perfect example for dealing with strings of data types is to deal with strings themselves which is why the example provided shows how Nets deal with words and then sentences and then paragraphs. Of course as with any data structure it is up to the individual whether they have any use for it and I wouldn't have thought of it unless I had a specific use in mind.

Anatomy of a Net

A Net gets it's name from the diamond pattern style of fisherman's net although this is not entirely accurate as we don't complete the diamond shape, a basic Net node looks like this,

Screenshot - BasicNetObject.png

It largely is an expansion on the tree data structure with a pointer to the forward node and nodes to the left and right that are based on object comparisons with the node forward of the current node to the left or up in the picture being less that the data in the forward node itself and the data in the right nodes, being greater than the data within the forward node. Of course things are made more complicated by the fact that the left and right nodes are actually lists of nodes. So without getting into code just yet I'll try to explain the way it works.

If you have three strings "ABC", "ABB" and ABF" and you add them to the net you will get a net that looks like,

Screenshot - SimpleNetLetters.png

If you add them in the order listed above. If you were to add the "ABF" string first then the "C" and the "B" would both be to the left or up on the picture. It should be added here that if you were to add the string "AB" at this point it would disappear as the simple net does not allow duplicate strings, the "AB" string would simply merge in with the other strings and be forgotten.

However if you add two more strings "ABBE" and "ABAF" the net will look like this,

Screenshot - SimpleNetAddedLetters.png

At this point the original string "ABB" has been completely overwritten and will no longer be returned when you print the contents of the net.

As can be seen from the lists to the left and right ( above and below in the image ) are sorted automatically as the net is built. This is now an integral part of how the nets work ( see the code discussion ) but it is clear that the nets would work faster without this, although the primary goal in the development of this project has been to get the nets data structure working correctly rather than getting it to work fast.

Using the Demo

The best way to show exactly how the net works is to run through the demonstration project and how to use it with the aim to show not only how it works but to give an understanding of what can be achieved through using the nets data structure.

The initial and main demonstration deals with text files this is pretty rough and ready code and the output depends pretty much entirely on the editing of the file. I mention this because included in the project is a text copy of Charles Darwin's Origin Of The Species which is used throughout the main part of the demonstration and which does contain a few editing errors.

The Book Demo

The tab for the book demo looks like,

Screenshot - SimpleNetBooksDemoTab.png

There are three types to this demo the words, sentences and the paragraphs demo, as the names suggest the idea is that you select a book and the net is built up from either the words, the sentences or the paragraphs contained in the book. It should be noted that the more complex the net then the longer it will take the computer to build it so the words demo will build relatively fast, while the sentences demo will take a while and if you are going to run the paragraphs demo, then bring a book or don't surf the news pages until after you've started it.

As it is the quickest the words demo is always set as the default on start up. The bottom half of the tab deals with the search options that we shall look at shortly firstly start up the demo by selecting the file button at the top,

Screenshot - SimpleNetFileLoad.png

and selecting a text file. The program will then read the file and build the net. The output will look like,

Screenshot - SimpleNetWordsDemoRunOne.png

You'll have to scroll down for the more meaningful items but as you can see,

Screenshot - SimpleNetWordsRunEnd.png

6512 items to choose from.

Searching

There are two types of search that can be run on the net and these are a search on data strings within the net that start with a given string which is on the left and a search for a data string that contains the given items.

The search starting with is on the left of the tab. if we enter the letter o in the text box and press the left most search button,

Screenshot - SimpleNetSearchSO.png

we get all the individual words in the net starting with the letter o. There are 161. If we move to the right side of the tab and enter oo into the text box there we can run the search for all the words containing the letters o and o,

Screenshot - SimpleNetSearchCOO.png

Here we get the 482 words in the book that contain the values o and o. Note here that the default behaviour is to return any items that contain the values o and o and if you want to run a search for all the items that contain the values o and o together you need to specify this by clicking the sequential checkbox. There are 81.

The sentences and the paragraphs demos work in exactly the same way they are just dealling with much more complicated data.

The Data Demo

The data demo is much simpler in construction in that it is simply provided to show the nets working using a custom or more advanced data type,

Screenshot - SimpleNetTestDataDemoTab.png

The insert data button builds a net of random data using a class that has an integer and a string as members these are then created in random length strings of data and inserted into the net. You can click on the Insert Data button as many times as you feel like. The search options allow for the entering of a single integer value and text to go with it. The valid range is from 0 - 9 for the integer and one of "up", "down", "left", "right", "forward", "backward" for the text. Note that they are all lower case and that they don;t have any intrinsic meaning within the program they are simply strings.

If you click the Insert Data button a few times you will get something that looks like this,

Screenshot - SimpleNetMovementInsert.png

And a search through the generated data for the value 5 and the text "forward" will give something like:

Search For String Containing 
5 forward
Print out for item 0
Value = 3 String = forward Value = 7 String = left Value = 4 String = right
    Value = 3 String = down Value = 8 String = down Value = 0 String = down 
Value = 5 String = forward Value = 5 String = up Value = 6 String = 
down Value = 3 String = backward Value = 1 String = down Value = 3 
String = right Value = 4 String = down Value = 6 String = right Value = 1 
String = up Value = 5 String = forward Value = 3 String = forward Value = 0 
String = down Value = 8 String = left 
Print out for item 1
Value = 6 String = backward Value = 4 String = down Value = 5 
String = forward 
Print out for item 2
Value = 6 String = up Value = 8 String = forward Value = 3 
String = right Value = 0 String = forward Value = 5 String = right Value = 5 
String = forward Value = 3 String = forward 
Print out for item 3
Value = 7 String = backward Value = 6 String = backward Value = 3 
String = forward Value = 4 String = right Value = 8 
    String = forward Value = 8
String = right Value = 4 String = backward Value = 3 String = down Value = 5 
String = forward Value = 4 String = left Value = 5 String = left Value = 5 
String = backward Value = 6 String = left Value = 0 String = backward 
    Value = 2 String = forward Value = 0 String = backward 
Print out for item 4
Value = 8 String = right Value = 6 String = down Value = 5 
String = left Value = 2 String = right Value = 9 String = forward Value = 3 
String = down Value = 6 String = forward Value = 8 String = up Value = 6 
String = down Value = 4 String = right Value = 3 String = down Value = 5 
String = forward

Note that the search mechanism here is for all the strings containing the value entered, so in this example we get no values that start with the entered data.

Using the Net

The SimpleNet class is defined as,

public class SimpleNet< T > where T : IComparer< T >

which is a simple generic class definition that uses a class that implements the IComparer interface. This class can be made up of any data that is required by the user with the only provision being that it must implement the IComparer Compare method. This is the sole requirement to using the SimpleNet class. So the class we use for the book example is defined as,

public class StringData : IComparer< StringData >
{
    private string strData;

    public string Data
    {
        get
        {
            return strData;
        }
        set
        {
            strData = value;
        }
    }
          
     public StringData( string data )
    {
        strData = data;
    }

    public StringData()
    {
        strData = null;
    }
   
    public int Compare( StringData dataOne, StringData dataTwo )
    {
        return String.Compare( dataOne.Data.ToLower(), 
            dataTwo.Data.ToLower() );
    }

    public override string ToString()
    {
        return strData;
    }
}

As you can see there is nothing special about this class at all except the Compare function which calls the string Compare function which probably calls the IComparable CompareTo Function. This then is the technical complexity needed to use classes in the SimpleNet. A Compare function that calculates if the data in the objects is equal returning a zero if they are or a -1 if the first object passed is smaller than the second object and a 1 if the first object is greater than the second object.

As we are dealing with complex data strings or collections the data must be entered into the net as lists of data and this is done in the mainform.cs file when we create the strings of data for the data demo with the code,

List< DemoData > newList = new List< DemoData >();

if( demoNet == null )
demoNet = new SimpleNet< DemoData >();

int nNumberOfItems = rand.Random( 20 );
int nPiecesOfData = 0;

List< string > stringCol = new List< string >();
stringCol.Add( "up" );
stringCol.Add( "down" );
stringCol.Add( "left" );
stringCol.Add( "right" );
stringCol.Add( "forward" );
stringCol.Add( "backward" );

for( int i=0; i<nNumberOfItems; i++ )
{
    newList.Clear();
    nPiecesOfData = rand.Random( 20 );
    for( int j=0; j<nPiecesOfData; j++ )
    {
        newList.Add( new DemoData( rand.Random( 10 ), 
            stringCol[ rand.Random( stringCol.Count ) ] ) );
  }

  demoNet.InsertList( newList );
}

I use this example as it is very simple in that all we do is create a list define a few variables and then build it before adding to the Net with the InsertList function. It also possible to use the inbuilt facilities of the class which contains a list that can hold the data before inserting it into the Net. This is done through the AddToList( T data ) function and then calling the InsertList() function though you must remember if you use this technique to call ClearList() before starting a new list.

To perform a search you use the code,

List< StringData > list = new List< StringData >();

for( int i=0; i<strWorking.Length; i++ )
{
    list.Add( new StringData( strWorking[ i ].ToString() ) );
}

List< List< StringData > > resultsList = 
    wordNet.FindListsStartingWith( list );

which builds a list to search for and then calls the search function, in this example I am using the code to search for a word at the start of a string from the wordNet example in mainform.cs.

The SimpleNet Class

The SimpleNet class is 1400 lines long so I will give more of an overview of it's basic operations here and let those who are interested go through the code. The class itself consists of 14 public functions of these, 4 functions are for inserting data, these being the already mentioned AddToList, ClearList and the overloaded InsertList. Two functions return lists of lists of the data in the Net these being the NetPrintOut function which returns the data as it gets it and the GetSortedList which returns the same data but sorts it first. There are then 2 IsListPresent functions which returns a boolean value for the given list. There are 2 overloads for the FindListsStartingWith and 5 overloads for FindListsContaining.

To start with all data in the SimpleNet is stored and manipulated as part of the SimpleNetNode class. This is your typical data type node that contains the lists and the points to the node in front and the node behind or the parent node. It is defined as,

public class SimpleNetNode< T > where T : IComparer< T >

and contains the data,

private SimpleNetNode< T > parentNode;
private SimpleNetNode< T > forwardNode;
private List< SimpleNetNode< T > > leftList;
private List< SimpleNetNode< T > > rightList;
private T data;
private bool bIsBase;
private bool bIsRightBase;
private bool bIsLeftBase;

as you can see the first four variables point to either other nodes or lists of other nodes. The data variable is the class that is added to the node and the Base variables are used to as quick checks to see if the lists are used.

There is no reason as user of the Net class to ever create a Net node object. In fact this was originally private but as we are using generics the compiler was never quite sure if there was an error or not. So it was simpler and cleaner to have it as a separate class.

In the SimpleNet class there are two conversion functions,

private List< SimpleNetNode< T > > ConvertListToNodes( List< T > dataList )

which converts the input lists to a list of SimpleNetNodes and the functions,

private List< T > ConvertList( List< SimpleNetNode< T > > list )

and

private List< List< T > > ConvertList( 
    List< List< SimpleNetNode< T > > > list )

which convert the lists back before they are returned to the calling code.

Inserting Data

As you have seen the lists of data are inserted using the InsertList function which is responsible for building the Net. This function starts by taking the list as a parameter,

private void InsertList( List< SimpleNetNode< T > > insertionList )

and then cycles through the list dealing with one item at a time,

for( int i=0; i<insertionList.Count; i++ )
{
    SimpleNetNode< T > nnNewData = 
        new SimpleNetNode< T >( insertionList[ i ].Data );
}

If the list is empty the data is made to be the root,

if( TrueRoot.ForwardNode == null )
{
    TrueRoot.ForwardNode = nnNewData;
    CurrentNode = TrueRoot.ForwardNode;
    continue;
}

If there is already data in the list we check the next item which is the forward of the current node. and if this is null we add the data at this point. If at this point the CurrentNode forward is not null we have to do a comparison and decide where to place the item, so the main block of the InsertList code looks like,

nnTempNode = CurrentNode;
int nResult = nnNewData.Data.Compare( nnNewData.Data, 
    CurrentNode.ForwardNode.Data );
if( nResult == 0 )
{
    CurrentNode = CurrentNode.ForwardNode;
}
else if( nResult == -1 )
{
}
else
{
}

This is where data is overwritten because if the data we are adding is the same as the data in the list we do nothing with it but simply move forward and go on to the next item in the list.

If the data is Not equal whether it be larger or greater then we must continue checking it so for demonstration purposes we'll assume that the Compare function returned -1.

bool bFound = false;
for( int n=0; n<CurrentNode.LeftListCount(); n++ )
{
    nnTempNode = CurrentNode.GetLeftNodeAt( n );
                                                   
    if( nnTempNode.Data.Compare( nnTempNode.Data, nnNewData.Data ) == 0 )
    {
        CurrentNode = nnTempNode;
        bFound = true;
        break;
    }
}
                                                
if( bFound == false )
{
    SimpleNetNode< T > insertionNode = 
        new SimpleNetNode< T >( nnNewData.Data );
    insertionNode.ParentNode = CurrentNode;
    CurrentNode.AddLeftNode( insertionNode );
    nnSortNode = CurrentNode;
    CurrentNode.IsBase = true;
    CurrentNode.IsLeftBase = true;
    CurrentNode = insertionNode;
    bSortLeftList = true;
}

We cycle through all the elements in the chosen list and if it is present set the current pointer to the node in the list and continue from there when we move to the next item in the insert list. If the node is not present in the list we add it to the list.

Searching

The main searching functions in the SimpleNet class are,

public List< List< T > > FindListsStartingWith( List< T > dataList )

and

public List< List< T > > FindListsStartingWith()

These two functions as with there companion FindListsContaining functions search through the SimpleNet and return the results as a list of lists for the data. This is done by converting the Net into a List of Lists internally and then searching through the lists. There are other ways to do this but with a data structure as complicated as a Net it is easy to tie yourself into knots when trying to move through searching for specific items so it was felt that this is a solution viable for a simple iteration of the Nets data structure.

Conclusion

This article is an introduction to the ideas and the mechanisms of the Net data structure and as such with the exception of the insertion code the techniques especially for the search are of a more simplistic nature than they are in a full Nets structure. Despite this it is hoped that this introduction has given an idea of the power that Nets can bring to a project.

References

  • Tod Golding ( 2005 ) Professional .Net 2.0 Generics

  • Nicholai M. Josuttis ( 1999 ) The C++ Standard Library

  • Tom Archer ( 2001 ) Inside C#, Microsoft Press

  • Jeffrey Richter ( 2002 ) Applied Microsoft .NET Framework Programming, Microsoft Press

  • Charles Peltzold ( 2002 ) Programming Microsoft Windows With C#, Microsoft Press

  • Robinson et al ( 2001 ) Professional C#, Wrox

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

pseudonym67


Member

Location: United Kingdom United Kingdom

Other popular Algorithms & Recipes articles:

Article Top
You must Sign In to use this message board.
FAQ FAQ 
 
Noise Tolerance  Layout  Per page   
 Msgs 1 to 15 of 15 (Total in Forum: 15) (Refresh)FirstPrevNext
GeneralIs it me or do you need to re-format your article? PinmemberKeith Vinson12:41 17 Sep '07  
GeneralRe: Is it me or do you need to re-format your article? PinmemberJecc0:57 18 Sep '07  
GeneralRe: Is it me or do you need to re-format your article? Pinmemberpseudonym671:32 18 Sep '07  
GeneralRe: Is it me or do you need to re-format your article? PinmemberKeith Vinson5:06 18 Sep '07  
GeneralRe: Is it me or do you need to re-format your article? Pinmemberpseudonym677:43 18 Sep '07  
GeneralRe: Is it me or do you need to re-format your article? PinmemberKeith Vinson11:12 18 Sep '07  
GeneralRe: Is it me or do you need to re-format your article? Pinmemberpseudonym670:21 19 Sep '07  
GeneralReminds me of Pile Pinmemberjongalloway11:49 16 Sep '07  
GeneralRe: Reminds me of Pile Pinmemberpseudonym671:45 17 Sep '07  
GeneralRe: Reminds me of Pile PinmemberGerhardKreuzer9:42 18 Sep '07  
GeneralRe: Reminds me of Pile Pinmemberpseudonym6710:05 18 Sep '07  
GeneralLittle explanation of probable usage Pinmemberrajantawate1(http//www.jhatak.com)11:59 13 Sep '07  
GeneralRe: Little explanation of probable usage Pinmemberpseudonym6714:04 13 Sep '07  
GeneralTrie PinmemberDaniel Grunwald10:57 13 Sep '07  
GeneralRe: Trie Pinmemberpseudonym6711:31 13 Sep '07  

General General    News News    Question Question    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

PermaLink | Privacy | Terms of Use
Last Updated: 13 Sep 2007
Editor:
Copyright 2007 by pseudonym67
Everything else Copyright © CodeProject, 1999-2009
Web22 | Advertise on the Code Project