Click here to Skip to main content
Click here to Skip to main content

Sorting Collections by Multiple Properties

, 9 Sep 2008 CPOL
Rate this:
Please Sign up or sign in to vote.
How to sort collections by properties using a SQL-like expression

Introduction

If you work with .NET and commercial applications, you probably have many objects to represent your business entities, like Customers, Employees, Products and so on.

When retrieving data from a data store (like a database, for example), we usually ask for the data in a specific order, so that the underlying engine can do the work to order it for us, and then we populate our objects in the correct order.

There are cases however, that for some reason we may need to rearrange objects in memory, either because we can't afford to go back to the data store (web services can be very slow for example) or because its simpler than changing an existing library to do this.

Sorting items by property in .NET is relatively easy, but doing a "sql-like" sort involving multiple properties in different directions is not a trivial task.

As I always get people asking me how to do it, I dediced to write this article to give an insight about the sort methods available in .NET and introduce you to my classes where you can sort objects by properties using a sql-like statement, using a code like this:

EmployeeCollection employees = GetEmployeeCollection()
employees.Sort("IsActive DESC, Name")

Although people using the .NET Framework 3.5 should be able to do this easily using LINQ, I won't cover LINQ or lambda expressions. I will focus on how to implement this using .NET 2.0 only, which is still widely used. Some people may find it useful for some scenarios on 3.5 though.

In this article, we will work with the "Employee" class as described below:

Background

Comparers

Before going into details on the actual sorting, it is important to understand a basic part of most sorting algorithms: the element comparison. (Note: there are sorting algorithms that are not comparison based, but they are more complex and used in specific cases only. For most situations, comparison sorts will work fine.)

So, it doesn't matter what sort algorithm you choose to use, somewhere in your code you will always need to know if an element should go before or after another element in the list. That functionality is provided in .NET by the IComparer interface.

The IComparer interface is a contract that requires the implementation of the "Compare" method. This method accepts 2 objects and returns 0 if they are equal, a negative number if the first is lower than the second and a positive number if the first is higher than the second.

There is a default implementation of the System.Collections.IComparer interface, the Comparer.Default, that will work with almost any type you will want to compare (like strings, numbers, dates, etc).

In .NET 2.0, microsoft introduced the generic System.Collections.Generic.IComparer<T> and the System.Collections.Generic.Comparison<T> delegate. The IComparer<T> is exactly like the IComparer interface, but is strongly typed. The Comparison<T> delegate is like the IComparer<T>.Compare method, and is handy when you don't want or need to create an extra class. There is also a default implementation of the IComparer<T> implementation, the Comparer<T>.Default.

Array.Sort

The .NET framework provides a way to sort arrays using the Array.Sort method. This method uses the QuickSort algorithm to sort arrays of any type. You can easily sort an array of Employees by the "Name" property by creating a method to compare their names and then using this method as a delegate of type Comparison<T> like this:

private int CompareEmployeeNames(Customer a, Customer b)
{
    return String.Compare(a.Name, b.Name);
}

and use it like this:

Employee[] employees = GetEmployeeArray();
Array.Sort(employees, CompareEmployeeNames);

Since String.Compare already returns the right numeric values for string comparisons, we don't need to do anything else.

You can also use the List<T>.Sort in the same way to sort lists. It will internally use the Array.Sort method.

Sorting by multiple columns

Sort by multiple "columns" is very easy. Imagine you want to sort a collection of our Employee objects by active status (actives first), then by name and then by ID. The idea is to simply apply the order in the reverse order.

The steps to accomplish the sort above would be:

  1. Sort by ID in ascending order
  2. Sort by Name in ascending order
  3. Sort by IsActive in descending order.

So, you must think, if we just apply the sorts in this order using Array.Sort or List<T>.Sort, we are done. Well, not exactly.

The Problem: Array.Sort is Unstable

Array.Sort works fine if you need to sort only by one property or column, and if that is what you need, you should stop reading here. But if you need to order by more columns like you do on SQL (order by field1, field2, field3), you will have a surprise. The problem with Array.Sort is that it is unstable. At this point I need to say: DON'T PANIC! That doesn't mean your program will crash if you use it.

Stability is property of sorting algorithms. An unstable sort algorithm will change the order of the items in the list even if they are already ordered. Stable algorithms in contrast won't change the order of items if they are ordered.

See what happens when we apply Array.Sort to the IsActive column in a list that is already sorted by name:

Since all items have IsActive = True, they are already sorted, and you wouldn't expect the sort to do anything. But remember that Array.Sort uses an unstable algorithm, and unstable algorithms may change the order of sorted items if that makes them run faster.

Imagine we were doing the 3 steps discussed previously. The second sort would change the order of the first, and the third would change the second. That makes Array.Sort pretty useless to sort multi-column data.

The Solution

Implementing the Merge Sort

The first thing to solve here is the sort algorithm. As previously discussed, the Array.Sort uses QuickSort, which is unstable. Since there is no alternative in .NET, we must implement or our own stable sort algorithm.

I decided to implement the Merge Sort, which is stable and has a reasonable speed. You can implement your own sort or use an external library of your choice to do this. But for the sake of simplicity, this implementation is good enough.

Here is a sample of the MergeSort that I implemented:

public sealed class MergeSort<T>
{
    // the comparer to use (our PropertyComparer in this case)
    private IComparer<T> _comparer;

    // used to tell if the sort should be ascending or descending
    private bool _isAscending;

    // the list to sort
    private IList<T> _list;

    // a buffer of the same size of the list to work
    private T[] _buffer;

    // private constructor, used only inside this class by the static Sort method
    private MergeSort
	(IList<T> list, IComparer<T> comparer, ListSortDirection direction)
    {
        _list = list;
        _buffer = new T[list.Count];
        _comparer = comparer;
        _isAscending = (direction == ListSortDirection.Ascending);
    }

    // This is the method you actually call to do the sort
    public static void Sort(IList<T> list, IComparer<T> comparer, 
		ListSortDirection direction)
    {
        MergeSort<T> sort = new MergeSort<T>(list, comparer, direction);
        sort._MergeSort(0, list.Count - 1);
    }

    // This method is used internally to compare values.
    // it will check the order and return the inverse values
    // if you are sorting in descending order
    private int Compare(T x, T y)
    {
        if (_isAscending)
            return _comparer.Compare(x, y);
        else
            return _comparer.Compare(y, x);
    }

    // the recursive part to "divide and conquer"
    private void _MergeSort(int firstIndex, int lastIndex)
    {
        int lastRelativeIndex = lastIndex - firstIndex;

        if (lastRelativeIndex < 1)
            return;

        int middle = (lastRelativeIndex / 2) + firstIndex;
        int postMiddle = middle + 1;

        _MergeSort(firstIndex, middle);
        _MergeSort(postMiddle, lastIndex);

        Merge(firstIndex, middle, postMiddle, lastIndex);
    }

    // the actual sort
    private void Merge(int leftStart, int leftEnd, int rightStart, int rightEnd)
    {
        int bufferIndex = leftStart;
        int leftIndex = leftStart;
        int rightIndex = rightStart;

        // copy to the items we can to the buffer in the right order

        while (leftIndex <= leftEnd && rightIndex <= rightEnd)
        {
            if (Compare(_list[leftIndex], _list[rightIndex]) > 0)
                _buffer[bufferIndex++] = _list[rightIndex++];
            else
                _buffer[bufferIndex++] = _list[leftIndex++];
        }

        // copy the rest of the items to the buffer

        for (int i = leftIndex; i <= leftEnd; i++)
            _buffer[bufferIndex++] = _list[i];

        for (int i = rightIndex; i <= rightEnd; i++)
            _buffer[bufferIndex++] = _list[i];

        // copy the buffer back to the list

        for (int i = leftStart; i <= rightEnd; i++)
            _list[i] = _buffer[i];
    }
}

I won't cover in details how merge sort works, and you can check the details here if you want: http://en.wikipedia.org/wiki/Merge_sort.

Creating a PropertyComparer

Now, in order to sort by properties, we need a way to compare the objects by property name and know which one should go first or last. Instead of creating delegates for every case, I created the PropertyComparer, that given a type and a property name, can compare the values of those properties using reflection.

This class will find the property by name, extract the "get" method and call it to get the value.

This is a sample of how this works:

// gets the property
PropertyInfo pi = t.GetProperty
		(propertyName, BindingFlags.Public | BindingFlags.NonPublic |
                   BindingFlags.Instance | BindingFlags.IgnoreCase);

// gets the "get" acessor for the property
MethodInfo mi = pi.GetGetMethod();

// calls the method in the specific instance to get the value
object value = mi.Invoke(instance, null);

This will enable us to sort by comparing the property values. You can also use this Comparer with Array.Sort if you like.

Parsing the Sort expression

We already have the right sort algorithm and a way to compare the property values. Now, we can just create the comparers and call the MergeSort in the right direction.

PropertyComparer comparer = 
	new PropertyComparer(typeof(Employee), "Name");
SortHelper.MergeSort<employee>(employees, comparer, ListSortDirection.Ascending);

We could just create a bunch of comparers and call the MergeSort method multiple times. But instead, it is a better idea to create some code to parse a sort expression and do what we need. Remember, we want to use an expression like:

IsActive DESC, Name ASC, ID ASC

This is the last and easier part. This is the code for the sort parser:

public class CollectionSort<T>
{
    // This method is what you call to do the actual sort
    public static void Sort(IList<T> list, string sortExpression)
    {
        // builds the sort expressions into classes
        SortInfo[] sorts = BuildSorts(sortExpression);

        // loops in reverse direction and apply each sort
        for (int i = sorts.Length - 1; i >= 0; i--)
        {
            SortInfo si = sorts[i];
            MergeSort<T>.Sort(list, si.Comparer, si.Direction);
        }
    }

    // parses the sort expression and builds the list of property comparers 
    // and directions
    private static SortInfo[] BuildSorts(string sortExpression)
    {
        Type itemType = typeof(T);
        string[] sortArray = sortExpression.Split(',');
        int arrayLength = sortArray.Length;
        List<SortInfo> comparers = new List<SortInfo>(arrayLength);

        // loops through each of the sort expressions,
        // parse and add them to the list
        for (int i = 0; i < arrayLength; i++)
        {
            string sortExp = sortArray[i].Trim();

            ListSortDirection direction;
            string propertyName;
            int spacePos = sortExp.IndexOf(' ');

            // since there shouldn't be spaces on property names, spacePos >= 0
            // means that there should be an ASC or DESC in the expression
            if (spacePos >= 0)
            {
                propertyName = sortExp.Substring(0, spacePos);
                string sortOrder = sortExp.Substring(spacePos + 1);

                // checks the sort order
                if (String.Compare(sortOrder, "asc", 
                        StringComparison.OrdinalIgnoreCase) == 0)
                    direction = ListSortDirection.Ascending;
                else if (String.Compare(sortOrder, "desc", 
                         StringComparison.OrdinalIgnoreCase) == 0)
                    direction = ListSortDirection.Descending;
                else
                    throw new ArgumentException(
                        "Sort order '" + sortOrder + 
                        "' is invalid. 
                        Must be ASC or DESC.");
            }
            else
            {
                // if no space was found, there isn't a direction defined
                // so we default to ascending and the propertyName 
                // is the expression itself
                propertyName = sortExp;
                direction = ListSortDirection.Ascending;
            }

            // creates the comparer using the property name
            PropertyComparer<T> comparer = 
                 new PropertyComparer<T>(propertyName);
            
            // adds the new comparer and the sort direction
            comparers.Add(new SortInfo(comparer, direction));
        }

        // returns the array
        return comparers.ToArray();
    }

    // container class for the sort builder
    private class SortInfo
    {
        public SortInfo(PropertyComparer<T> comparer, 
             ListSortDirection direction)
        {
            this.Comparer = comparer;
            this.Direction = direction;
        }

        public PropertyComparer<T> Comparer;
        public ListSortDirection Direction;
    }
}

Using the code

Now we have a complete solution to sort collections by property names using a sql-like expression.

You can either use the CollectionSort in your code directly:

CollectionSort.Sort(myEmployeeCollection, "IsActive desc, Name");

or include the functionality directly into your custom collections:

using System.Collections.Generic.ObjectModel;

public class EmployeeCollection : Collection<Employee>
{
	public void Sort(string sortExpression)
	{
		ObjectSort.Sort(this, sortExpression);
	}
}

and use like I proposed in the introduction:

EmployeeCollection employees = GetEmployees();
employees.Sort("IsActive desc, Name");

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Natan Vivo

Brazil Brazil
No Biography provided

Comments and Discussions

 
GeneralMy vote of 1 Pinmemberlucho_198116-Feb-09 9:46 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web04 | 2.8.141022.2 | Last Updated 9 Sep 2008
Article Copyright 2008 by Natan Vivo
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid