The ELMO Principle - Part 2 - Sorting Algorithms

DreamInHex

3.44/5 (3 votes)

Dec 20, 2007

CPOL

3 min read

43584

How to use the best algorithm for the job.

Introduction

In this segment of the ELMO principle, I’m going to cover non-recursive sorting algorithms. Some people might say, “The .NET Framework already provides us with array sorting mechanisms and the SortedList generic! Why should I care about sorting algorithms?” Simply put, that is the scope of the ELMO (Extremely Low Memory Optimization) principle! If you can build your own sorting algorithms for your objects, you can take advantage of extremely fast sorting without any additional overhead, boxing or un-boxing. But, in order to do that, you have to know what the most efficient and least efficient methods are for sorting. With sorting algorithms, there is no “one-size-fits-all” approach. Sorting algorithms are more or less effective depending on “what” and “how many” they are sorting.

Background

In this article, I’m going to cover three non-recursive sorting algorithms. Simply put, these are single functions that sort without having to call other functions recursively. The sorting algorithms I’m going to address are Bubble, Shell, and Insertion. The example code I’m going to post illustrates benchmarks of each sorting algorithm, given an integer array of 100,000 pseudorandom integers between 0 and 9 to sort in ascending order. The console application is multi-threaded, so I’m going to post the Main() function first, then detail each static method called by the threads. This is a pretty solid CPU exercise for your PC, so if you build something like this to benchmark, be sure you have a robust PC, and that you have closed all other applications before running this.

Building a Benchmark

    public static Random rand = new Random();

    public static int[] intArray;

    [STAThread]
    static void Main(string[] args)
    {
      Console.WriteLine(“Sorting Benchmark”);
      Console.WriteLine(“====================================\n\r”);

      // Fill array with random ints between 0 and 9
      intArray = new int[100000];

      for (int x = 0; x < 100000; x++)
      {
        intArray[x] = rand.Next(0, 9);
      }

      Console.WriteLine(“Preparing to sort {0} integers:\n\r”,intArray.Length);

      // Create separate threads for each sort
      ThreadStart tsBubble = new ThreadStart(BubbleSort);
      Thread tBubble = new Thread(tsBubble);

      ThreadStart tsShell = new ThreadStart(ShellSort);
      Thread tShell = new Thread(tsShell);

      ThreadStart tsInsertion = new ThreadStart(InsertionSort);
      Thread tInsertion = new Thread(tsInsertion);

      // Start the race
      tBubble.Start();
      tShell.Start();
      tInsertion.Start();

      Console.ReadLine();

    }

First Sort: The Bubble Sort

The Bubble sort is the least efficient of the three sorting algorithms listed here. The Bubble sort proceeds through the array from top to bottom, comparing an item to the one next to it. If it’s larger, it will swap it, and start over. In this way, smaller values “bubble” to the top of the array. A full pass of the array isn’t completed until the Bubble sort gets to the very end of the array without swapping a value. The code for the algorithm looks like the following:

    static void BubbleSort()
    {
      Console.WriteLine(“Bubble sort started…”);
      // Stamp start time

      DateTime dtStart = DateTime.Now;

      int i, j, x, iTemp;

      x = intArray.Length;

      for (i = (x - 1); i >= 0; i–)
      {
        for (j = 1; j <= i; j++)
        {
          if (intArray[j - 1] > intArray[j])
          {
            iTemp = intArray[j - 1];

            intArray[j - 1] = intArray[j];

            intArray[j] = iTemp;
          }
        }
      }

      // Stamp end time
      DateTime dtEnd = DateTime.Now;

      TimeSpan ts = dtEnd.Subtract(dtStart);

      Console.WriteLine(“Bubble sort completed in {0} sec {1} ms.”, 
                        ts.Seconds, ts.Milliseconds);
    }

Second Sort: The Insertion Sort

The Insertion sort is generally at least twice as efficient as the Bubble sort, and usually a better sorting algorithm to use when sorting larger amount of data than is recommended by the Shell sort. This algorithm works by inserting each item into its final place in the array. The simplest implementation (as detailed below) requires two array structures; the source and the sorted array. For large amounts of data, the Insertion sort obliterated the Bubble sort in benchmark tests.

    static void InsertionSort()
    {
      Console.WriteLine(“Insertion sort started…”);
      // Stamp start time
      DateTime dtStart = DateTime.Now;
      int i, j, x, index;
      x = intArray.Length;
      for (i = 1; i < x; i++)
      {
        index = intArray[i];
        j = i;

        while ((j > 0) && (intArray[j - 1] > index))
        {
          intArray[j] = intArray[j - 1];
          j = j - 1;
        }
        intArray[j] = index;
      }

      // Stamp end time
      DateTime dtEnd = DateTime.Now;
      TimeSpan ts = dtEnd.Subtract(dtStart);
      Console.WriteLine(“Insertion sort completed in {0} sec {1} ms.”, 
                        ts.Seconds, ts.Milliseconds);
    }

Third Sort: The Shell Sort

Also known as a Comb sort, the Shell sort uses diminishing increments. It performs multiple passes through the array, and each time sorts equally sized sets using the Insertion sort. As the size of the set increases, the number of sets to be sorted decreases. The sets in a Shell sort are not contiguous. For example, if there are three sets, then a set is composed of every third element in the initial array. The Shell sort is supposed to be at least five times faster than a Bubble sort. In benchmarks, it performed even faster.

    static void ShellSort()
    {
      Console.WriteLine(“Shell sort started…”);

      // Stamp start time
      DateTime dtStart = DateTime.Now;
      int i, j, inc, iTemp, x;
      x = intArray.Length;
      inc = 3;
      while (inc > 0)
      {
        for (i = 0; i < x; i++)
        {
          j = i;
          iTemp = intArray[i];

          while ((j >= inc) && (intArray[j - inc] > iTemp))
          {
            intArray[j] = intArray[j - inc];
            j = j - inc;
          }
          intArray[j] = iTemp;
        }
        if (inc / 2 != 0)
        {
          inc = inc / 2;
        }
        else if (inc == 1)
        {
          inc = 0;
        }
        else
        {
          inc = 1;
        }
      }

      // Stamp end time
      DateTime dtEnd = DateTime.Now;
      TimeSpan ts = dtEnd.Subtract(dtStart);
      Console.WriteLine(“Shell Sort completed in {0} sec {1} ms.”, 
                        ts.Seconds, ts.Milliseconds);
    }

Conclusion

I’ve detailed three extremely useful sorting algorithms for integers. These can be applied rather easily to a string array as well, but I’ll leave you guys to that implementation. When we run the above code as a console application, we get the following benchmark data (for 100,000 records) output to the Console (note, benchmarks will vary based on the system – this particular system is a dual processor 3.0 GHz with 3.5 GB of RAM, running Windows Server 2003):

Sorting Benchmark
====================================

Preparing to sort 100000 integers:

Bubble sort started…
Shell sort started…
Insertion sort started…
Shell Sort completed in 2 sec 453 ms.
Insertion sort completed in 3 sec 234 ms.
Bubble sort completed in 41 sec 764 ms.

Thanks and Credit

I'd like to extend thanks to Public Joe's information base and examples regarding sorting algorithms.