Introduction
In this article, I present a class library which provides you a container indexed on more than one key of different types. This class library is inspired by Boost.MultiIndex Container by Joaquín M López Muñoz.
Background
Some of us have had a need for associative containers having more than one index, sometimes we also need associative containers with non-unique keys. Boost.MultiIndex provides a sophisticated library of container having more than one indexing mechanism. A need for a similar data structure in C# (.NET) and introduction of generics in .NET pushed me to design multi-index container similar concept.
Using the Code
Let's say we have an Employee class which is defined as:
public class Employee
{
SSN m_ssn; int m_id;
string m_firstName;
string m_lastName;
public Employee(int id, SSN ssn, string firstName, string lastName)
{
m_ssn = ssn;
m_id = id;
m_firstName = firstName;
m_lastName = lastName;
}
public SSN E_SSN
{
get { return m_ssn; }
set { m_ssn = value; }
}
public int E_ID
{
get { return m_id; }
set { m_id = value; }
}
public string LastName
{
get { return m_lastName; }
set { m_lastName = value; }
}
public string FirstName
{
get { return m_firstName; }
set { m_firstName = value; }
}
public override string ToString()
{
return "[" + m_id + "] " + m_firstName + " " +
m_lastName + " " + m_ssn.ToString();
}
}
Let's define a MultiIndex container for holding Employee objects which would be indexed by the following:
E_ID - Employee ID (unique)
E_SSN - Social Security (again unique)
FirstName - Non-unique first name
LastName - Non-unique last name
As you would have noticed from the keys, MultiIndexContainer works on "Properties". Since E_ID and E_SSN are unique keys and FirstName and LastName are non-unique keys, we would have to define them differently in our container. The following Index types would be used:
MultiIndexContainer.Index.UniqueKey
MultiIndexContainer.Index.NonUniqueKey
The above Index types have generic interfaces requiring you to define: type of element and type of key. You also have an option to associate a Tag of type MultiIndexContainer.ITag interface. Using tags would allow you to access an Index in a much more meaningful way than accessing using an occurrence/position based index. These Index types would be a part of an Index collection (MultiIndexContainer.Index.IndexedBy). The following will define a MultiIndexContainer instance for Employee:
MIContainer<
Employee,
IndexedBy<
Employee,
UniqueKey<
Employee,
int,
IDTag
>,
UniqueKey<
Employee,
SSN,
SSNTag
>,
NonUniqueKey<
Employee,
string,
FirstNameTag
>,
NonUniqueKey<
Employee,
string,
LastNameTag
>
>
> container = new MIContainer<Employee, IndexedBy<Employee,
UniqueKey<Employee, int, IDTag>, UniqueKey<Employee,
SSN, SSNTag>, NonUniqueKey<Employee, string,
FirstNameTag>, NonUniqueKey<Employee, string, LastNameTag>>>
(new IndexedBy<Employee, UniqueKey<Employee, int, IDTag>,
UniqueKey<Employee, SSN, SSNTag>, NonUniqueKey<Employee,
string, FirstNameTag>, NonUniqueKey<Employee,
string, LastNameTag>>(
new UniqueKey<Employee, int, IDTag>("E_ID"),
new UniqueKey<Employee, SSN, SSNTag>("E_SSN"),
new NonUniqueKey<Employee, string, FirstNameTag>("FirstName"),
new NonUniqueKey<Employee, string, LastNameTag>("LastName")
)
);
The C'tor looks pretty nasty up there ;) and in future I would like to make it nicer than the current version. As you would notice, I have associated tags on all the Indices. Tags are defined as:
public struct SSNTag : ITag { }
public struct IDTag : ITag { }
public struct FirstNameTag : ITag { }
public struct LastNameTag : ITag { }
MultiIndexContainer has the following interface:
public interface IMIContainer<ElemT>
{
void Add(ElemT elemt);
void Remove(Enumerators.IEnumerator<ElemT> iterator);
Index.IIndexedBy<ElemT> Indices { get; }
void Clear();
Enumerators.IEnumerator<ElemT> GetEnumerator();
void Foreach(Action<ElemT> action);
void Modify<KeyT>(IEnumerator<ElemT> keyEnumerator,
KeyModifier<ElemT> keyModifier);
}
Adding Elements
container.Add(
new Employee(
111, new SSN("121", "22", "3232"), "Parag", "Gadkari" )
);
container.Add(
new Employee(
111, new SSN("922", "52", "0000"), "Parag", "Gadkari" )
);
container.Add(
new Employee(
151, new SSN("121", "22", "3232"), "Parag", "Gadkari" )
);
container.Add(
new Employee(
1091, new SSN("999", "52", "0000"), "Parag", "Gadkari" )
);
Accessing Indices
All the indices derive from MultiIndexContainer.Index.IIndex<ElemT> interface. The following example shows how to access Indices (example of index set on LastName):
IIndex<Employee> index = container.Indices.Get<LastNameTag>();
IIndex<Employee> index = container.Indices[typeof(LastNameTag)];
IIndex<Employee> index = container.Indices.Get(typeof(LastNameTag));
IIndex<Employee> index = container.Indices[3];
IIndex<Employee> index = container.Indices.Get(3);
The following describes the IIndex<ElemT> interface:
public interface IIndex<ElemT>:Internals.IEnumerable<ElemT>
{
void Foreach(Action<ElemT> action);
void Sort();
Type Tag { get;}
}
For more functionalities (such as Find), the interface has to be cast to a more precise form of IIndex<KeyT,ElemT>. I really wanted to keep the need of typecasting away; but lack of *typedef* (and generics are not templates) forced the usage of casting.
Sort/Find/Foreach Etc.
IIndex<Employee> lastNameIndex = container.Indices.Get<LastNameTag>();
lastNameIndex.Sort();
lastNameIndex.Foreach(delegate(Employee empl)
{ Console.WriteLine(empl.ToString()); });
IIndex<string,Employee> exLastNameIndex = (IIndex<string,Employee>)lastNameIndex;
exLastNameIndex.Sort(delegate(string key1, string key2)
{ return key2.CompareTo(key1); });
IIndex<SSN, Employee> indexSSN =
(IIndex<SSN, Employee>)container.Indices.Get<SSNTag>();
MultiIndexContainer.Enumerators.IEnumerator<Employee> enm =
indexSSN.Find(new SSN("121", "22", "3232"));
while (enm.MoveNext())
{
Console.WriteLine(enm.Current.ToString());
container.Modify<SSN>(enm, this.SSNModifier);
}
Modify/Update a Key
Key can be modified in using the following:
IIndex<KeyT, ElemT>.Modify(KeyT oldKey, KeyT newKey)
IIndex<KeyT, ElemT>.Update(KeyT key)
IMIContainer<ElemT>.Modify<KeyT>(IEnumerator<ElemT> keyEnumerator, KeyModifier<ElemT> keyModifier)
The difference between Modify & Update methods of IIndex<KeyT,ElemT> is the way key is modified. Update modifies the key based on the property of ElemT, whereas Modify modifies based on the key passed to it. There are some rules while defining or updating a key:
- As of now, key *cannot* be
null.
- In case the key belongs to a
UniqueKey Index type, then the Key shouldn't exist in the Container.
MultiIndexContainer.Enumerators.IEnumerator<Employee> emp =
fnameIndex.Find("Parag");
while (emp.MoveNext())
{
emp.Current.FirstName = "Pragaa";
}
fnameIndex.ModifyKey("Parag", "Pragaa");
fnameIndex.UpdateKey("Parag");
MultiIndexContainer.Enumerators.IEnumerator<Employee> enm =
indexSSN.Find(new SSN("121", "22", "3232"));
while (enm.MoveNext())
{
container.Modify<SSN>(enm, this.SSNModifier);
}
Erasing an Element
MultiIndexContainer.Enumerators.IEnumerator<Employee> emp =
fnameIndex.Find("Parag");
while (emp.MoveNext())
{
container.Remove(emp);
}
Future Work
This is what I can think of (listed in the order of priority):
- Use of associative containers (that can be sorted on keys) or RBT instead of list of tuples for improving performance
- Remove redundant type definition from the MI
Container C'tor
- Copy C'tor
- Persistence & Archiving
- Extending support to
public fields, methods, key extractors
- Allowing nullable keys
- Database support
Your comments on the work would really be appreciated and would allow me to make this library better. Thanks.:)
History
- 20th May, 2007: Initial post