<html dir="LTR"><head><META http-equiv="Content-Type" content="text/html; charset=utf-8"><meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5"><title>Iesi.Collections</title><xml></xml><link rel="stylesheet" type="text/css" href="MSDN.css"></head><body id="bodyID" class="dtBODY"><div id="nsbanner"><div id="bannerrow1"><table class="bannerparthead" cellspacing="0"><tr id="hdr"><td class="runninghead">"Set" Collections Library for .NET</td><td class="product"></td></tr></table></div><div id="TitleRow"><h1 class="dtH1">Iesi.Collections Namespace</h1></div></div><div id="nstext"><p>
The System.Collections namespace in the .NET Framework provides a number of collection
types that are extremely useful for manipulating data in memory. However, there is one
type of collection that is conspicuously missing from <code>System.Collections</code>: the
<code>Set</code>.
</p><p>
A <code>Set</code> is a collection that contains no duplicate elements, and where the order of
the elements may be arbitrary. It is loosely modelled after
the mathematical concept of a "set." This implementation is based on the Java <code>Set</code>
interface definition, so if you are also a Java programmer, this may seem familiar.
This library provides a number of "standard" <code>Set</code> operators that the Java library
neglected to include.
</p><p><code>Sets</code> come in handy when an <code>Array</code> or a <code>List</code> won't quite fit the bill.
Arrays in .NET have a fixed length, making it tedious to add and remove elements.
Lists allow you add new objects easily, but you can have numerous duplicate
elements, which is undesirable for some types of problems.
Searching Arrays or Lists for elements is just plain slow for large data sets,
requiring a linear search. You could keep the array sorted and use a binary search,
but that is often more trouble than it is worth (especially since this library,
and the .NET Framework, provide better ways that are already written for you).
</p><p>
With sets, adding elements, removing elements, and checking for the existence
of an element is fast and simple. You can mix and match the elements in different
sets using the supported mathematical set operators: union, intersection,
exclusive-or, and minus.
</p><p>
You will see some interesting side effects with different <code>Set</code>
implementations in this library, depending on the underlying search algorithm.
For example, if you choose a sort-based <code>Set</code>, the elements will come out
in sort order when you iterate using <code>foreach</code>. If you use a hash-based
<code>Set</code>, the elements will come out in no particular order, but checking
for inclusion will fastest when dealing with large data sets. If you use a
list-based <code>Set</code>, elements will come out in the order you put them in
when you iterate (although the effect of operators on element order in
<code>Set</code> instances is not well defined by design). Additionally, list-based
sets are fastest for very small data sets (up to about 10 elements),
but get slower very quickly as the number of contained elements increases.
To get the best of both worlds, the library provides a <code>Set</code> type that
uses lists for small data sets and switches to a hash-based algorithm when
the data set gets large enough to warrant it.
</p><p>
The following sample program demonstrates some of the features of sets:
<pre class="code">
using System;
using Iesi.Collections;
namespace RiverDemo
{
class Rivers
{
[STAThread]
static void Main(string[] args)
{
//Use Arrays (which are ICollection objects) to quickly initialize.
Set arizona
= new SortedSet(new string[] {"Colorado River"});
Set california
= new SortedSet(new string[] {"Colorado River", "Sacramento River"});
Set colorado
= new SortedSet(new string[] {"Arkansas River", "Colorado River", "Green River", "Rio Grande"});
Set kansas
= new SortedSet(new string[] {"Arkansas River", "Missouri River"});
Set nevada
= new SortedSet(new string[] {"Colorado River"});
Set newMexico
= new SortedSet(new string[] {"Rio Grande"});
Set utah
= new SortedSet(new string[] {"Colorado River", "Green River", "San Juan River"});
//Rivers by region.
Set southWest = colorado | newMexico | arizona | utah;
Set midWest = kansas;
Set west = california | nevada;
//All rivers (at least for the demo).
Set all = southWest | midWest | west;
Print("All rivers:", all);
Print("Rivers in the southwest:", southWest);
Print("Rivers in the west:", west);
Print("Rivers in the midwest:", midWest);
Console.WriteLine();
//Use the '-' operator to subtract the rivers in Colorado from
//the set of all rivers.
Print("Of all rivers, these don't pass through Colorado:", all - colorado);
//Use the '&' operator to find rivers that are in Colorado AND in Utah.
//A river must be present in both states, not just one.
Print("Rivers common to both Colorado and Utah:", colorado & utah);
//use the '^' operator to find rivers that are in Colorado OR Utah,
//but not in both.
Print("Rivers in Colorado and Utah that are not shared by both states:",
colorado ^ utah);
//Use the '&' operator to discover which rivers are present in Arizona,
// California,Colorado, Nevada, and Utah. The river must be present in
// all states to be counted.
Print("Rivers common to Arizona, California, Colorado, Nevada, and Utah:",
arizona & california & colorado & nevada & utah);
//Just to prove to you that operators always return like types, let's do a
//complex Set operation and look at the type of the result:
Console.WriteLine("The type of this complex operation is: " +
((southWest ^ colorado & california) | kansas).GetType().FullName);
}
private static void Print(string title, Set elements)
{
Console.WriteLine(title);
foreach(object o in elements)
{
Console.WriteLine("\t" + o);
Console.WriteLine();
}
}
}
</pre></p><p>
Although there are other kinds of sets available in the library, the example uses
<code>SortedSet</code> throughout. This is nice for the example, since everything will
print neatly in alphabetical order. But you may be wondering what kind of <code>Set</code>
is returned when you "union," "intersect," "exclusive-or," or "minus" two <code>Set</code>
instances. The library always returns a <code>Set</code> that is the same type as
the <code>Set</code> on the left, unless the left operand is null, in which case it
returns the type of the <code>Set</code> on the right.
</p><p>
Here is the output from running the example:
<pre class="code">
All rivers:
Arkansas River
Colorado River
Green River
Missouri River
Rio Grande
Sacramento River
San Juan River
Rivers in the southwest:
Arkansas River
Colorado River
Green River
Rio Grande
San Juan River
Rivers in the west:
Colorado River
Sacramento River
Rivers in the midwest:
Arkansas River
Missouri River
Of all rivers, these don't pass through Colorado:
Missouri River
Sacramento River
San Juan River
Rivers common to both Colorado and Utah:
Colorado River
Green River
Rivers in Colorado and Utah that are not shared by both states:
Arkansas River
Rio Grande
San Juan River
Rivers common to Arizona, California, Colorado, Nevada, and Utah:
Colorado River
The type of this complex operation is:
Iesi.Collections.SortedSet
Press any key to continue
</pre></p><p><a href="Iesi.CollectionsHierarchy.html">Namespace hierarchy</a></p><h3 class="dtH3">Classes</h3><div class="tablediv"><table class="dtTABLE" cellspacing="0"><tr valign="top"><th width="50%">Class</th><th width="50%">Description</th></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.DictionarySet.html">DictionarySet</a></td><td width="50%"><p><code>DictionarySet</code> is an abstract class that supports the creation of new <code>Set</code>
types where the underlying data store is an <code>IDictionary</code> instance.</p><p>You can use any object that implements the <code>IDictionary</code> interface to hold set data.
You can define your own, or you can use one of the objects provided in the Framework.
The type of <code>IDictionary</code> you choose will affect both the performance and the behavior
of the <code>Set</code> using it. </p><p>This object overrides the <code>Equals()</code> object method, but not the <code>GetHashCode()</code>, because
the <code>DictionarySet</code> class is mutable. Therefore, it is not safe to use as a key value in a hash table.</p><p>To make a <code>Set</code> typed based on your own <code>IDictionary</code>, simply derive a
new class with a constructor that takes no parameters. Some <code>Set</code> implmentations
cannot be defined with a default constructor. If this is the case for your class,
you will need to override <code>Clone()</code> as well.</p><p>It is also standard practice that at least one of your constructors takes an <code>ICollection</code> or
an <code>ISet</code> as an argument.</p></td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.HashedSet.html">HashedSet</a></td><td width="50%">
Implements a <code>Set</code> based on a hash table. This will give the best lookup, add, and remove
performance for very large data-sets, but iteration will occur in no particular order.
</td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.HybridSet.html">HybridSet</a></td><td width="50%">
Implements a <code>Set</code> that automatically changes from a list to a hash table
when the size reaches a certain threshold. This is good if you are unsure about
whether you data-set will be tiny or huge. Because this uses a dual implementation,
iteration order is not guaranteed!
</td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.ImmutableSet.html">ImmutableSet</a></td><td width="50%"><p>Implements an immutable (read-only) <code>Set</code> wrapper.</p><p>Although this is advertised as immutable, it really isn't. Anyone with access to the
<code>basisSet</code> can still change the data-set. So <code>GetHashCode()</code> is not implemented
for this <code>Set</code>, as is the case for all <code>Set</code> implementations in this library.
This design decision was based on the efficiency of not having to <code>Clone()</code> the
<code>basisSet</code> every time you wrap a mutable <code>Set</code>.</p></td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.ListSet.html">ListSet</a></td><td width="50%">
Implements a <code>Set</code> based on a list. Performance is much better for very small lists
than either <code>HashedSet</code> or <code>SortedSet</code>. However, performance degrades rapidly as
the data-set gets bigger. Use a <code>HybridSet</code> instead if you are not sure your data-set
will always remain very small. Iteration produces elements in the order they were added.
However, element order is not guaranteed to be maintained by the various <code>Set</code>
mathematical operators.
</td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.Set.html">Set</a></td><td width="50%"><p>A collection that contains no duplicate elements. This class models the mathematical
<code>Set</code> abstraction, and is the base class for all other <code>Set</code> implementations.
The order of elements in a set is dependant on (a)the data-structure implementation, and
(b)the implementation of the various <code>Set</code> methods, and thus is not guaranteed.</p><p><code>Set</code> overrides the <code>Equals()</code> method to test for "equivalency": whether the
two sets contain the same elements. The "==" and "!=" operators are not overridden by
design, since it is often desirable to compare object references for equality.</p><p>Also, the <code>GetHashCode()</code> method is not implemented on any of the set implementations, since none
of them are truely immutable. This is by design, and it is the way almost all collections in
the .NET framework function. So as a general rule, don't store collection objects inside <code>Set</code>
instances. You would typically want to use a keyed <code>IDictionary</code> instead.</p><p>None of the <code>Set</code> implementations in this library are guranteed to be thread-safe
in any way unless wrapped in a <code>SynchronizedSet</code>.</p><p>The following table summarizes the binary operators that are supported by the <code>Set</code> class.</p><div class="tablediv"><table class="dtTABLE" cellspacing="0"><tr valign="top"><th width="50%">Operation</th><th width="50%">Description</th><th width="50%">Method</th><th width="50%">Operator</th></tr><tr valign="top"><td>Union (OR)</td><td>Element included in result if it exists in either <code>A</code> OR <code>B</code>.</td><td><code>Union()</code></td><td><code>|</code></td></tr><tr valign="top"><td>Intersection (AND)</td><td>Element included in result if it exists in both <code>A</code> AND <code>B</code>.</td><td><code>InterSect()</code></td><td><code>&</code></td></tr><tr valign="top"><td>Exclusive Or (XOR)</td><td>Element included in result if it exists in one, but not both, of <code>A</code> and <code>B</code>.</td><td><code>ExclusiveOr()</code></td><td><code>^</code></td></tr><tr valign="top"><td>Minus (n/a)</td><td>Take all the elements in <code>A</code>. Now, if any of them exist in <code>B</code>, remove
them. Note that unlike the other operators, <code>A - B</code> is not the same as <code>B - A</code>.</td><td><code>Minus()</code></td><td><code>-</code></td></tr></table></div></td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.SortedSet.html">SortedSet</a></td><td width="50%">
Implements a <code>Set</code> based on a sorted tree. This gives good performance for operations on very
large data-sets, though not as good - asymptotically - as a <code>HashedSet</code>. However, iteration
occurs in order. Elements that you put into this type of collection must implement <code>IComparable</code>,
and they must actually be comparable. You can't mix <code>string</code> and <code>int</code> values, for example.
</td></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.SynchronizedSet.html">SynchronizedSet</a></td><td width="50%"><p>Implements a thread-safe <code>Set</code> wrapper. The implementation is extremely conservative,
serializing critical sections to prevent possible deadlocks, and locking on everything.
The one exception is for enumeration, which is inherently not thread-safe. For this, you
have to <code>lock</code> the <code>SyncRoot</code> object for the duration of the enumeration.</p></td></tr></table></div><h3 class="dtH3">Interfaces</h3><div class="tablediv"><table class="dtTABLE" cellspacing="0"><tr valign="top"><th width="50%">Interface</th><th width="50%">Description</th></tr><tr valign="top"><td width="50%"><a href="Iesi.Collections.ISet.html">ISet</a></td><td width="50%"><p>A collection that contains no duplicate elements. This interface models the mathematical
<code>Set</code> abstraction.
The order of elements in a set is dependant on (a)the data-structure implementation, and
(b)the implementation of the various <code>Set</code> methods, and thus is not guaranteed.</p><p><code>Set</code> overrides the <code>Equals()</code> method to test for "equivalency": whether the
two sets contain the same elements. The "==" and "!=" operators are not overridden by
design, since it is often desirable to compare object references for equality.</p><p>Also, the <code>GetHashCode()</code> method is not implemented on any of the set implementations, since none
of them are truely immutable. This is by design, and it is the way almost all collections in
the .NET framework function. So as a general rule, don't store collection objects inside <code>Set</code>
instances. You would typically want to use a keyed <code>IDictionary</code> instead.</p><p>None of the <code>Set</code> implementations in this library are guranteed to be thread-safe
in any way unless wrapped in a <code>SynchronizedSet</code>.</p><p>The following table summarizes the binary operators that are supported by the <code>Set</code> class.</p><div class="tablediv"><table class="dtTABLE" cellspacing="0"><tr valign="top"><th width="50%">Operation</th><th width="50%">Description</th><th width="50%">Method</th></tr><tr valign="top"><td>Union (OR)</td><td>Element included in result if it exists in either <code>A</code> OR <code>B</code>.</td><td><code>Union()</code></td></tr><tr valign="top"><td>Intersection (AND)</td><td>Element included in result if it exists in both <code>A</code> AND <code>B</code>.</td><td><code>InterSect()</code></td></tr><tr valign="top"><td>Exclusive Or (XOR)</td><td>Element included in result if it exists in one, but not both, of <code>A</code> and <code>B</code>.</td><td><code>ExclusiveOr()</code></td></tr><tr valign="top"><td>Minus (n/a)</td><td>Take all the elements in <code>A</code>. Now, if any of them exist in <code>B</code>, remove
them. Note that unlike the other operators, <code>A - B</code> is not the same as <code>B - A</code>.</td><td><code>Minus()</code></td></tr></table></div></td></tr></table></div></div></body></html>