Introduction
While searching for how to speed up string sorting in .NET collections, I came across this article: Fast String Sort in C# and F#
I downloaded the StringSort class from the article, tested the speed, and found it sorts 50,000 strings in about 60% less time than the Generic IEnumerable
object sorting methods.
So, to build on the previous project by
Stefan Savev 2, I have derived a new
sfList
class from
List<string>
.
Using the code
For your
List<string>
objects, you simply use sfList instead. The only thing that is different is the method it uses to sort strings.
using System;
using System.Collections.Generic;
using Sorting.CSharpStringSort;
namespace SortTests.Sorting
{
public class sfList : List<string>
{
public sfList() : base() { }
public sfList(int size) : base(size) { }
public sfList(IEnumerable<String> aArray) : base(aArray) { }
public new void Sort()
{
string[] a = this.ToArray();
this.Clear();
StringSort.Sort(ref a);
this.AddRange(a);
}
}
}
From my initial testing, I could see that it is faster to copy the
List
to a
string[]
, clear the list, sort the array using
StringSort.Sort
and refill the list with the result. This is what is done by replacing the
List.Sort
method with my call to
StringSort.Sort
. A very small amount of time is used to do the two extra copies compared to the sort time.
I also tried an overload that allows direct sorting by calling StringSort.Sort(this)
, but this was around 25% slower (best case).
The only required modification to the StringSort
class is to add this method overload for in-place Sort
on string[]
:
public static void Sort(ref string[] inout)
{
InPlaceSort(inout, 0, 0, inout.Length);
}
I put the ref
in it just to make it a unique signature to overload. The original method returns a sorted copy. I also added a bunch of overloads for testing times for various sorts, but these are not required.
Stopwatch Tests
Ticks to sort copy ------------ 132206
Ticks to sort aArray ---------- 111181
Ticks to Sedgewick Sort bArray: 108932
Ticks to sort sList ----------- 196452
Ticks to List.Sort() sList2 --- 288872
Ticks to sfList.Sort() sList3 - 122279
So from these elapsed tick tests, you can see that List.Sort()
(original Microsoft sort) is very slow. The difference between Sedgewick and StringSort is insignificant. Sorting any List
seems to have a big overhead compared to string[]
. Replacing List.Sort()
in a derived class yields the best solution.
History
First published July 4, 2012
Dan Randolph is currently a Web Applications Developer with Delcan. Mr. Randolph has a B.S. dergee in Computer Science from the University of Wyoming. He is an active member of the Denver Visual Studio User Group. You can find him posting in the forums on [code.]msdn.microsoft.com and Code Project.