Implementing the .NET IComparer interface to get a more natural sort order

4 Jul 2008 · CPOL · 2 min read
The comparers built into .NET sort numbers or strings. This little class, available in both C# and VB, shows how to implement an IComparer that works with strings mixing characters and numbers.

naturalcomparer.png

Introduction

Have you noticed that Explorer in XP is smart enough to sort files in a natural order?

If you have 10 files on your hard disk, they will show in this order:

file1.txt
file2.txt
file3.txt
file4.txt
file5.txt
file6.txt
file7.txt
file8.txt
file9.txt
file10.txt

However, if you try it under DOS, they will appear this way:

file1.txt
file10.txt
file2.txt
...

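The reason is that DOS uses a simple alphabetical sort: names are compared character by character, so the "1" in "file10" sorts before the "2" in "file2", whatever digits follow.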
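The DOS-style ordering is easy to reproduce in .NET with an ordinal string comparison. This is only a small sketch; the class and method names are made up for the illustration.

```csharp
using System;

public static class OrdinalSortDemo
{
    // Returns a copy of the names sorted character by character,
    // which is the order a plain alphabetical sort produces.
    public static string[] SortAlphabetically(string[] names)
    {
        string[] copy = (string[])names.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }
}
```

Given file2.txt, file10.txt and file1.txt, this returns file1.txt, file10.txt, file2.txt.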

The aim of this article is to show how I think Explorer produces this more natural order, and to give CodeProject readers a class that reproduces it in their .NET programs.

Background

The .NET framework uses the IComparer interface a lot. This interface is very simple to implement; it contains a single member:

VB
'VB
Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer
//C#
int IComparer.Compare(object x, object y)

The function you must provide returns an integer which must be:

  • less than zero when x is less than y
  • zero when x equals y
  • more than zero when x is greater than y

Comparing a null reference (Nothing in Visual Basic) with any reference type is allowed, and does not generate an exception. A null reference is considered to be less than any reference that is not null.
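To make the contract concrete, here is a minimal sketch of a non-generic IComparer that honors the three return cases and the null rule. LengthComparer is a hypothetical class invented for this example; it is not part of the download and simply orders strings by length.

```csharp
using System;
using System.Collections;

// Hypothetical comparer, for illustration only: orders strings by length.
public sealed class LengthComparer : IComparer
{
    public int Compare(object x, object y)
    {
        // A null reference sorts before any non-null reference.
        if (x == null) return y == null ? 0 : -1;
        if (y == null) return 1;

        // This sketch assumes both arguments are strings;
        // anything else fails with an InvalidCastException.
        string a = (string)x;
        string b = (string)y;

        // Negative, zero, or positive, as the contract requires.
        return a.Length.CompareTo(b.Length);
    }
}
```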

Using the code

The class NaturalComparer can be used anywhere the .NET framework requires an IComparer. That covers nearly every operation that sorts data. The most common is probably Array.Sort.

In the demo program, for example, I use:

VB
Array.Sort(lines, New NaturalComparer())

This will sort the lines using the natural order I have briefly described above.
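Sorting an array of objects rather than strings works the same way, as long as the comparison is delegated to a string comparer. The sketch below assumes a hypothetical FileEntry type and a helper named KeySort; neither is part of the download. It accepts any IComparer<string>, so StringComparer.Ordinal can stand in while you try it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type, for illustration only.
public sealed class FileEntry
{
    public string Name { get; }
    public FileEntry(string name) { Name = name; }
}

public static class KeySort
{
    // Sorts entries in place by Name, delegating the string comparison
    // to whichever IComparer<string> the caller supplies.
    public static void SortByName(FileEntry[] entries, IComparer<string> nameComparer)
    {
        Array.Sort(entries, Comparer<FileEntry>.Create(
            (a, b) => nameComparer.Compare(a.Name, b.Name)));
    }
}
```

Since the C# NaturalComparer implements IComparer<string>, passing an instance of it as the second argument should give the natural order.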

How does this work

This NaturalComparer class uses two StringParser instances, one for each string being compared.

The StringParser is a pretty straightforward character parser. It consumes the string one character at a time and, through its NextToken method, produces a series of tokens. Each token is either numerical or a string.

C#
public void NextToken()
{
  do
  {
    if (mCurChar == '\0')
    {
      mTokenType = NaturalComparer.TokenType.Nothing;
      mStringValue = null;
      return; 
    }
    else if (char.IsDigit(mCurChar))
    {
      mTokenType = NaturalComparer.TokenType.Numerical;
      ParseNumericalValue();
      return; 
    }
    else if (char.IsLetter(mCurChar))
    {
      mTokenType = NaturalComparer.TokenType.String;
      // This can also optionally return
      // numericals in case of Roman Numerals
      ParseString();
      return; 
    }
    else
    {
      // Ignore this character and loop some more 
      NextChar();
    }
  } while (true);
}

The StringParser ignores punctuation and spaces.
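To see what the tokens look like, here is a simplified sketch that returns them all at once as a list. It is not the article's StringParser, which hands out one token per NextToken call and can also recognize Roman numerals; SimpleTokenizer is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleTokenizer
{
    // Splits text into runs of digits and runs of letters,
    // skipping every other character (punctuation, spaces).
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            int start = i;
            if (char.IsDigit(text[i]))
            {
                while (i < text.Length && char.IsDigit(text[i])) i++;
                tokens.Add(text.Substring(start, i - start));
            }
            else if (char.IsLetter(text[i]))
            {
                while (i < text.Length && char.IsLetter(text[i])) i++;
                tokens.Add(text.Substring(start, i - start));
            }
            else
            {
                i++; // ignored character
            }
        }
        return tokens;
    }
}
```

With this sketch, "file10.txt" yields the tokens "file", "10" and "txt".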

The NaturalComparer itself has very little to do. It takes the current token from each string and compares the numerical values if both tokens are numerical; otherwise, it compares the string values. As long as the tokens are equal, it moves on to the next pair.

C#
int System.Collections.Generic.IComparer<string>.Compare(string string1,
                                                         string string2)
{
  // Init must leave each parser on its first token,
  // since TokenType is read right away.
  mParser1.Init(string1);
  mParser2.Init(string2);
  int result;
  do
  {
    if (mParser1.TokenType == TokenType.Numerical &&
        mParser2.TokenType == TokenType.Numerical)
      // both current tokens are numerical
      result = decimal.Compare(mParser1.NumericalValue, mParser2.NumericalValue);
    else
      result = string.Compare(mParser1.StringValue, mParser2.StringValue);
    if (result != 0) return result;

    // the tokens are equal, move on to the next pair
    mParser1.NextToken();
    mParser2.NextToken();
  } while (!(mParser1.TokenType == TokenType.Nothing &&
             mParser2.TokenType == TokenType.Nothing));
  return 0; // identical
}
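For readers who want to experiment without the download, the same idea fits in a compact, self-contained sketch. It is a stand-in and not the article's class: SketchNaturalComparer is an invented name, it tokenizes with a regular expression instead of a StringParser, it compares digit runs without parsing them, it ignores case, and it has no Roman numeral support.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// A compact stand-in, not the article's NaturalComparer.
public sealed class SketchNaturalComparer : IComparer<string>
{
    // A token is a run of ASCII digits or a run of letters; the rest is skipped.
    private static readonly Regex TokenPattern = new Regex(@"[0-9]+|\p{L}+");

    public int Compare(string x, string y)
    {
        if (x == null) return y == null ? 0 : -1;
        if (y == null) return 1;

        MatchCollection a = TokenPattern.Matches(x);
        MatchCollection b = TokenPattern.Matches(y);
        int shared = Math.Min(a.Count, b.Count);

        for (int i = 0; i < shared; i++)
        {
            string s = a[i].Value, t = b[i].Value;
            bool bothNumeric = char.IsDigit(s[0]) && char.IsDigit(t[0]);
            int result = bothNumeric
                ? CompareDigits(s, t)
                : string.Compare(s, t, StringComparison.OrdinalIgnoreCase);
            if (result != 0) return result;
        }
        // All shared tokens are equal: the string with fewer tokens sorts first.
        return a.Count.CompareTo(b.Count);
    }

    // Compares two digit runs by value without parsing them, so very long
    // numbers cannot overflow: after dropping leading zeros, the longer run
    // is the larger number, and equal lengths compare digit by digit.
    private static int CompareDigits(string s, string t)
    {
        s = s.TrimStart('0');
        t = t.TrimStart('0');
        if (s.Length != t.Length) return s.Length.CompareTo(t.Length);
        return string.CompareOrdinal(s, t);
    }
}
```

Sorting file10.txt, file2.txt and file1.txt with this comparer gives file1.txt, file2.txt, file10.txt.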

Points of interest

As an option, you can ask the NaturalComparer to detect and parse Roman numerals.

VB
New NaturalComparer(NaturalComparerOptions.RomanNumbers)

The problem is that the comparer can sometimes mistake an ordinary English word for a Roman numeral ("mix", for instance, is also a well-formed numeral, MIX = 1009). This is harmless when the other side of the comparison is a string, but it can upset the sort order when the other side is a number or another false positive.

So, use this option only if you believe Roman numerals are likely enough in your data to be worth the risk of a disturbed order.
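The false-positive risk is easy to demonstrate with a small Roman numeral reader. This is a deliberately lenient sketch under my own assumptions, not the parser shipped with the article: RomanSketch is an invented name, and it accepts any sequence of Roman digits without checking that the sequence is well formed.

```csharp
using System;
using System.Collections.Generic;

public static class RomanSketch
{
    private static readonly Dictionary<char, int> Values = new Dictionary<char, int>
    {
        { 'I', 1 }, { 'V', 5 }, { 'X', 10 }, { 'L', 50 },
        { 'C', 100 }, { 'D', 500 }, { 'M', 1000 }
    };

    // Lenient reader: returns false if any character is not a Roman digit.
    // It does not reject malformed sequences such as "IIII".
    public static bool TryParse(string word, out int value)
    {
        value = 0;
        if (string.IsNullOrEmpty(word)) return false;

        string upper = word.ToUpperInvariant();
        foreach (char c in upper)
        {
            if (!Values.ContainsKey(c)) return false;
        }

        for (int i = 0; i < upper.Length; i++)
        {
            int current = Values[upper[i]];
            bool hasNext = i + 1 < upper.Length;
            // Subtractive notation: a smaller digit before a larger one is subtracted.
            if (hasNext && current < Values[upper[i + 1]]) value -= current;
            else value += current;
        }
        return true;
    }
}
```

Here "XIV" reads as 14, "file" is rejected, and the ordinary word "mix" reads as 1009, which is exactly the kind of false positive described above.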

History

  • 2008 Jan 17: First release.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
France
I am a French programmer.
These days I spend most of my time with the .NET framework, JavaScript and HTML.
