Introduction
If you work with .NET and commercial applications, you probably have many objects to represent your business entities, like Customers, Employees, Products and so on.
When retrieving data from a data store (like a database, for example), we usually ask for the data in a specific order, so that the underlying engine can do the work to order it for us, and then we populate our objects in the correct order.
There are cases however, that for some reason we may need to rearrange objects in memory, either because we can't afford to go back to the data store (web services can be very slow for example) or because its simpler than changing an existing library to do this.
Sorting items by property in .NET is relatively easy, but doing a "sql-like" sort involving multiple properties in different directions is not a trivial task.
As I always get people asking me how to do it, I dediced to write this article to give an insight about the sort methods available in .NET and introduce you to my classes where you can sort objects by properties using a sql-like statement, using a code like this:
EmployeeCollection employees = GetEmployeeCollection()
employees.Sort("IsActive DESC, Name")
Although people using the .NET Framework 3.5 should be able to do this easily using LINQ, I won't cover LINQ or lambda expressions. I will focus on how to implement this using .NET 2.0 only, which is still widely used. Some people may find it useful for some scenarios on 3.5 though.
In this article, we will work with the "Employee" class as described below:
Background
Comparers
Before going into details on the actual sorting, it is important to understand a basic part of most sorting algorithms: the element comparison. (Note: there are sorting algorithms that are not comparison based, but they are more complex and used in specific cases only. For most situations, comparison sorts will work fine.)
So, it doesn't matter what sort algorithm you choose to use, somewhere in your code you will always need to know if an element should go before or after another element in the list. That functionality is provided in .NET by the IComparer interface.
The IComparer interface is a contract that requires the implementation of the "Compare" method. This method accepts 2 objects and returns 0 if they are equal, a negative number if the first is lower than the second and a positive number if the first is higher than the second.
There is a default implementation of the System.Collections.IComparer interface, the Comparer.Default, that will work with almost any type you will want to compare (like strings, numbers, dates, etc).
In .NET 2.0, microsoft introduced the generic System.Collections.Generic.IComparer<T> and the System.Collections.Generic.Comparison<T> delegate. The IComparer<T> is exactly like the IComparer interface, but is strongly typed. The Comparison<T> delegate is like the IComparer<T>.Compare method, and is handy when you don't want or need to create an extra class. There is also a default implementation of the IComparer<T> implementation, the Comparer<T>.Default.
Array.Sort
The .NET framework provides a way to sort arrays using the Array.Sort method. This method uses the QuickSort algorithm to sort arrays of any type. You can easily sort an array of Employees by the "Name" property by creating a method to compare their names and then using this method as a delegate of type Comparison<T> like this:
private int CompareEmployeeNames(Customer a, Customer b)
{
return String.Compare(a.Name, b.Name);
}
and use it like this:
Employee[] employees = GetEmployeeArray();
Array.Sort(employees, CompareEmployeeNames);
Since String.Compare already returns the right numeric values for string comparisons, we don't need to do anything else.
You can also use the List<T>.Sort in the same way to sort lists. It will internally use the Array.Sort method.
Sorting by multiple columns
Sort by multiple "columns" is very easy. Imagine you want to sort a collection of our Employee objects by active status (actives first), then by name and then by ID. The idea is to simply apply the order in the reverse order.
The steps to accomplish the sort above would be:
- Sort by ID in ascending order
- Sort by Name in ascending order
- Sort by IsActive in descending order.
So, you must think, if we just apply the sorts in this order using Array.Sort or List<T>.Sort, we are done. Well, not exactly.
The Problem: Array.Sort is Unstable
Array.Sort works fine if you need to sort only by one property or column, and if that is what you need, you should stop reading here. But if you need to order by more columns like you do on SQL (order by field1, field2, field3), you will have a surprise. The problem with Array.Sort is that it is unstable. At this point I need to say: DON'T PANIC! That doesn't mean your program will crash if you use it.
Stability is property of sorting algorithms. An unstable sort algorithm will change the order of the items in the list even if they are already ordered. Stable algorithms in contrast won't change the order of items if they are ordered.
See what happens when we apply Array.Sort to the IsActive column in a list that is already sorted by name:
Since all items have IsActive = True, they are already sorted, and you wouldn't expect the sort to do anything. But remember that Array.Sort uses an unstable algorithm, and unstable algorithms may change the order of sorted items if that makes them run faster.
Imagine we were doing the 3 steps discussed previously. The second sort would change the order of the first, and the third would change the second. That makes Array.Sort pretty useless to sort multi-column data.
The Solution
Implementing the Merge Sort
The first thing to solve here is the sort algorithm. As previously discussed, the Array.Sort uses QuickSort, which is unstable. Since there is no alternative in .NET, we must implement or our own stable sort algorithm.
I decided to implement the Merge Sort, which is stable and has a reasonable speed. You can implement your own sort or use an external library of your choice to do this. But for the sake of simplicity, this implementation is good enough.
Here is a sample of the MergeSort that I implemented:
public sealed class MergeSort<T>
{
private IComparer<T> _comparer;
private bool _isAscending;
private IList<T> _list;
private T[] _buffer;
private MergeSort
(IList<T> list, IComparer<T> comparer, ListSortDirection direction)
{
_list = list;
_buffer = new T[list.Count];
_comparer = comparer;
_isAscending = (direction == ListSortDirection.Ascending);
}
public static void Sort(IList<T> list, IComparer<T> comparer,
ListSortDirection direction)
{
MergeSort<T> sort = new MergeSort<T>(list, comparer, direction);
sort._MergeSort(0, list.Count - 1);
}
private int Compare(T x, T y)
{
if (_isAscending)
return _comparer.Compare(x, y);
else
return _comparer.Compare(y, x);
}
private void _MergeSort(int firstIndex, int lastIndex)
{
int lastRelativeIndex = lastIndex - firstIndex;
if (lastRelativeIndex < 1)
return;
int middle = (lastRelativeIndex / 2) + firstIndex;
int postMiddle = middle + 1;
_MergeSort(firstIndex, middle);
_MergeSort(postMiddle, lastIndex);
Merge(firstIndex, middle, postMiddle, lastIndex);
}
private void Merge(int leftStart, int leftEnd, int rightStart, int rightEnd)
{
int bufferIndex = leftStart;
int leftIndex = leftStart;
int rightIndex = rightStart;
while (leftIndex <= leftEnd && rightIndex <= rightEnd)
{
if (Compare(_list[leftIndex], _list[rightIndex]) > 0)
_buffer[bufferIndex++] = _list[rightIndex++];
else
_buffer[bufferIndex++] = _list[leftIndex++];
}
for (int i = leftIndex; i <= leftEnd; i++)
_buffer[bufferIndex++] = _list[i];
for (int i = rightIndex; i <= rightEnd; i++)
_buffer[bufferIndex++] = _list[i];
for (int i = leftStart; i <= rightEnd; i++)
_list[i] = _buffer[i];
}
}
I won't cover in details how merge sort works, and you can check the details here if you want: http://en.wikipedia.org/wiki/Merge_sort.
Creating a PropertyComparer
Now, in order to sort by properties, we need a way to compare the objects by property name and know which one should go first or last. Instead of creating delegates for every case, I created the PropertyComparer, that given a type and a property name, can compare the values of those properties using reflection.
This class will find the property by name, extract the "get" method and call it to get the value.
This is a sample of how this works:
PropertyInfo pi = t.GetProperty
(propertyName, BindingFlags.Public | BindingFlags.NonPublic |
BindingFlags.Instance | BindingFlags.IgnoreCase);
MethodInfo mi = pi.GetGetMethod();
object value = mi.Invoke(instance, null);
This will enable us to sort by comparing the property values. You can also use this Comparer with Array.Sort if you like.
Parsing the Sort expression
We already have the right sort algorithm and a way to compare the property values. Now, we can just create the comparers and call the MergeSort in the right direction.
PropertyComparer comparer =
new PropertyComparer(typeof(Employee), "Name");
SortHelper.MergeSort<employee>(employees, comparer, ListSortDirection.Ascending);
We could just create a bunch of comparers and call the MergeSort method multiple times. But instead, it is a better idea to create some code to parse a sort expression and do what we need. Remember, we want to use an expression like:
IsActive DESC, Name ASC, ID ASC
This is the last and easier part. This is the code for the sort parser:
public class CollectionSort<T>
{
public static void Sort(IList<T> list, string sortExpression)
{
SortInfo[] sorts = BuildSorts(sortExpression);
for (int i = sorts.Length - 1; i >= 0; i--)
{
SortInfo si = sorts[i];
MergeSort<T>.Sort(list, si.Comparer, si.Direction);
}
}
private static SortInfo[] BuildSorts(string sortExpression)
{
Type itemType = typeof(T);
string[] sortArray = sortExpression.Split(',');
int arrayLength = sortArray.Length;
List<SortInfo> comparers = new List<SortInfo>(arrayLength);
for (int i = 0; i < arrayLength; i++)
{
string sortExp = sortArray[i].Trim();
ListSortDirection direction;
string propertyName;
int spacePos = sortExp.IndexOf(' ');
if (spacePos >= 0)
{
propertyName = sortExp.Substring(0, spacePos);
string sortOrder = sortExp.Substring(spacePos + 1);
if (String.Compare(sortOrder, "asc",
StringComparison.OrdinalIgnoreCase) == 0)
direction = ListSortDirection.Ascending;
else if (String.Compare(sortOrder, "desc",
StringComparison.OrdinalIgnoreCase) == 0)
direction = ListSortDirection.Descending;
else
throw new ArgumentException(
"Sort order '" + sortOrder +
"' is invalid.
Must be ASC or DESC.");
}
else
{
propertyName = sortExp;
direction = ListSortDirection.Ascending;
}
PropertyComparer<T> comparer =
new PropertyComparer<T>(propertyName);
comparers.Add(new SortInfo(comparer, direction));
}
return comparers.ToArray();
}
private class SortInfo
{
public SortInfo(PropertyComparer<T> comparer,
ListSortDirection direction)
{
this.Comparer = comparer;
this.Direction = direction;
}
public PropertyComparer<T> Comparer;
public ListSortDirection Direction;
}
}
Using the code
Now we have a complete solution to sort collections by property names using a sql-like expression.
You can either use the CollectionSort in your code directly:
CollectionSort.Sort(myEmployeeCollection, "IsActive desc, Name");
or include the functionality directly into your custom collections:
using System.Collections.Generic.ObjectModel;
public class EmployeeCollection : Collection<Employee>
{
public void Sort(string sortExpression)
{
ObjectSort.Sort(this, sortExpression);
}
}
and use like I proposed in the introduction:
EmployeeCollection employees = GetEmployees();
employees.Sort("IsActive desc, Name");