Click here to Skip to main content
15,888,802 members
Articles / Programming Languages / C#
Article

Sorted Set and MultiSet with Embedded Keys

Rate me:
Please Sign up or sign in to vote.
4.00/5 (7 votes)
16 Feb 2008CPOL4 min read 21.3K   258   14  
AVL Search tree implementation with the objects accessed by an embedded key

Introduction

There are three collection objects defined in this project, all of them based on AVL trees. This article does not go into the AVL algorithms - there are plenty of articles already on that, and in fact I used the Goletas Collections code for that part of the implementation. This article concentrates on using Keys embedded in the objects as the sort order, rather than having separate keys as required by the SortedList and SortedDictionary objects in the .NET Framework.

Background

When I first discovered the Goletas Collections, I thought "Great!" However, when I started trying to use them in real-world applications, I soon discovered their limitations.

Firstly, when I tried to hook them up as a data source for list controls, I realized that they had their own set of Collection interfaces, and it just did not work. Therefore, the first class in this project, the SortedSet, is simply an implementation of the Goletas code that extends System.Collections.Generic.ICollection<T> rather than Goletas.Collection.ICollection<T>.

The second issue arose when I went looking for one of my objects in the tree. For example, I have a collection of Movie objects, all sorted by Title. These objects all contain additional properties for Rating, Comments, Category, Collections of Actors, etc. as well as the MovieID from the database. It would be nice if I could just find the Movie object for "Star Wars"...

Enter the Goletas SortedDictionary. Fine, except that the Key is already a part of the object, and as I started creating multiple SortedDictionaries for different key values, my code started becoming quite messy. It would be much nicer if the SortedSet could just extract the key value from the object, and be able to do its search comparisons on the IComparable<K> rather than the IComparable<T>.

Simple enough... I defined the IKeyedType<K> interface that retrieves an IComparable<K> Key from the objects. Then I modified the AVL code in the SortedSet so that the comparisons were done on this key rather than on the object itself, added a couple of methods to access the objects by key (an Indexer and TryGetValue method) and away I went.

Finally, I hit one more problem - I have a few different movies with the same title. I could change my key to include the ID property, but then when I went looking for a movie, I needed to know the ID as well. Not good enough. I thought about a SortedDictionary of IList<T> objects - not too bad, but still requires the key and the object to be separated, and I didn't like that. I then thought about having a SortedSet of ICollection<T> objects, but maintaining the collections in the application code was even messier than the separate key-value pairs. And enumerating the movies...well Yuck!

So I decided to build a KeyedMultiSet that maintains its own member collection of the objects. Its enumerator enumerates the member collections as it does the MoveNext, and my application code is again neat.

Using the Code

The SortedSet object is mainly used to be able to efficiently enumerate sorted collections of objects. In order to achieve this, the objects must have an idea of how to sort themselves. They do this by implementing the IComparable<> interface. Providing this is implemented, the SortedSet works just like any other ICollection object.

C#
class MyObject : IComparable<MyObject> {
    public string Title;
    public long ID;

    ...

    // Implement IComparable<MyObject>
    public int CompareTo(MyObject other) {
        int c = Title.CompareTo(other.Title);
        // Allow for duplicate titles by comparing on the ID value if needed.
        if (c == 0)
            c = ID.CompareTo(other.ID);
        return c;
    }
}

class MyAppClass {

    ...

    SortedSet<MyObject> TheCollection;
    ...

    // Lists the objects in order
    foreach(MyObject o in TheCollection)
        Console.WriteLine("(0}: {1}", o.ID, o.Title);

The KeyedSet class requires an IComparable key to be exposed by the object. This can either be one of the .NET native types, or an application defined type that implements the IComparable<K> interface.

C#
struct MyKey : IComparable<MyKey&gt {
    public string Title;
    public long ID;

    public MyKey(string Title, long ID) {
        this.Title = Title;
        this.ID = ID;
    }

    // Implement IComparable<MyKey>
    public int CompareTo(MyObject other) {
        int c = Title.CompareTo(other.Title);
        // Allow for duplicate titles by comparing on the ID value if needed.
        if (c == 0)
            c = ID.CompareTo(other.ID);
        return c;
    }
}

// Allow sets of MyObject sorted on MyKey, or just the string title.
class MyObject : IKeyedType<MyKey>, IKeyedType<string> {
    public string Title;
    public long ID;
    ...
    // Implement IKeyedType<MyKey>
    MyKey IKeyedType<MyKey>.Key {
        get {
            return new MyKey(Title, ID);
        }
    }
    // Implement IKeyedType<string>
    string IKeyedType<string>.Key {
        get {
            return Title;
        }
    }
}

class MyAppClass {

    KeyedSet<string, MyObject> TitleCollection;
    KeyedSet<MyKey, MyObject> RecordCollection;
    MyObject anInstance;
    ...

    // Add a record to the collection

    TitleCollection.Add(anInstance);    // Note that this will throw an
                                        // ArgumentException if object with the
                                        // same title already exists in the collection

    RecordCollection.Add(anInstance);   // This is keyed by title and ID,
                                        // so duplicate titles
                                        // will not throw an exception

    // Find an object by Title
    MyObject found = TitleCollection["A title"];

    // If I don't know it already exists??
    if (!TryGetValue("A title", out found))
        MessageBox.Show("Not Found!");

    ...
}

The KeyedMultiSet is the trickiest to use. The collection does not directly contain the actual objects, instead it is a collection of collections of the objects. The enumerator handles moving through the member collection, but when accessing an object in the collection, you need to remember that the returned object is also a collection.

I have declared the type of collection to use internally as one of the generic parameters. Mostly, a list is probably good enough, but you may also want to sort the sub-collections, so passing it a KeyedSet is not out of the question. The example below shows this being done on names, so the enumerated list will contain all of the names sorted on surname, then firstname.

C#
struct Surname : IComparable<Surname> {
    public string name;
    public Surname(string Name) {
        name = Name;
    }
    public int CompareTo(Surname other) {
        return name.CompareTo(other.name);
    }
}

struct FirstName : IComparable<FirstName> {
    public string name;
    public FirstName(string Name) {
        name = Name;
    }
    public int CompareTo(FirstName other) {
        return name.CompareTo(other.name);
    }
}

struct FullName : IKeyedType<Surname>, IKeyedType<FirstName> {
    public Surname surname;
    public FirstName firstName;

    public FullName(string Surname, string FirstName) {
        surname = new Surname(Surname);
        firstName = new FirstName(FirstName);
    }

    // Implement the IKeyedType<K> interfaces
    FirstName IKeyedType<FirstName>.Key {
        get { return firstName; }
    }
    Surname IKeyedType<Surname>.Key {
        get { return surname; }
    }

    // Return the person's name
    public string Name {
        get { return string.Concat(surname.name, ", ", firstName.name); }
    }
}

class SortedNames {
    KeyedMultiSet<Surname, FullName, KeyedSet<FirstName, FullName>> names;

    public SortedNames() {
        names = new KeyedMultiSet<surname, />>();
    }

    public void AddName(FullName Name) {
        names.Add(Name);
    }
    public void AddName(string Surname, string FirstName) {
        names.Add(new FullName(Surname,FirstName));
    }

    public FullName FindName(string Surname, string FirstName) {
        return names[new Surname(Surname)][new FirstName(FirstName)];
    }

    public ICollection<FullName> FindFamily(string Surname) {
        return names[new Surname(Surname)];
    }

    public void ListNames() {
        foreach (FullName name in names)
            Console.WriteLine(name.Name);
    }
} 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Australia Australia
Been programming for 40 years now, starting when I was 13 on DEC PDP 11 (back in the day of paper tape storage, and hex switch boot procedures). Got right into micro-computers from an early age, with machines like the Dick Smith Sorcerer and the CompuColor II. Started CP/M and MS-DOS programming in the mid 1980's. By the end of the '80's, I was just starting to get a good grip on OOP (Had Zortech C++ V1.0).

Got into ATL and COM programming early 2002. As a result, my gutter vocabulary has expanded, but it certainly keeps me off the streets.

Recently, I have had to stop working full time as a programmer due to permanent brain damage as a result of a tumour (I just can't keep up the pace required to meet KPI's). I still like to keep my hand in it, though, and will probably post more articles here as I discover various tricky things.

Comments and Discussions

 
-- There are no messages in this forum --