15,119,322 members
Articles / General Programming / Sorting
Article
Posted 16 Dec 2013

24.4K views
20 bookmarked

# QuickSortByRange, QuickSelectByRange, PartitionByRange

Rate me:
A performance improvement for the classic sort / select algorithms.

## Introduction

The classic QuickSort and QuickSelect algorithms have poor worst-case performance. If bad pivots are consistently chosen, such as decreasing by only a single element each time, then worst-case performance is quadratic: O(n2). Duplicate values in the array are problematic for QuickSort.

This article introduces the QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms which provide a remedy to the problem with duplicates by returning a pivot range from the PartitionByRange algorithm. This approach works well when the array contains several duplicate values.

For example, if every element of the array contains the same value, then QuickSort delivers O(n2) whereas QuickSortByRange delivers O(n). Considering that the average case performance of QuickSort is O(n log n), the all-duplicates scenario becomes a better than average case scenario for QuickSortByRange as compared to the worse than average case scenario we see with QuickSort.

## Background

QuickSort and QuickSelect both rely on a partition algorithm. The partition algorithm looks at an initial (pivot) value within a range of the array and then reorganizes the values within that range so that all the values less than or equal to the pivot value are placed to the left of the pivot and the values greater than the pivot value are moved to the right of the pivot. The algorithm will move the pivot element, if necessary, to satisfy this quasi-sort and then it returns the final position of the pivot element.

For example, let's consider the following range within a larger array:

[ < , > , == , x , < , > , < , == ]

In this example, 'x' marks the pivot value, '<' indicates values less than the pivot value, '>' indicates values greater than the pivot value, and '==' indicates values equal to the pivot value.

After a single pass through the quickSort partition algorithm, these elements will have been reorganized to:

[ < , == , < , < , == , x , > , > ]

PartitionByRange takes the quasi-sort a step further. It further reorganizes the array so that elements equal to the pivot element are placed to the immediate left of the pivot element. Values less than the pivot value are placed to the left of this range of equal-value elements.

After a single pass through the PartitionByRange algorithm, these elements will have been reorganized to:

[ < , < , < , == , == , x , > , > ]

Instead of returning a single value indicating the final position of the pivot value, the PartitionByRange algorithm returns two values indicating the range of positions which contain the same value as the pivot value. This enables the QuickSortByRange and QuickSelectByRange algorithms to move the boundaries of the next pass through the PartitionByRange further to the right or left than QuickSort and QuickSelect, respectively.

## Using the Code

The code is written in C# for arrays of integers. It's fairly straightforward to port the code to another language or value type.

The attached file is an HttpHandler which is designed to run on the localhost of a machine with IIS. It includes benchmarking code to replicate the results posted at the end of this article.

One important part of the PartitionByRange code is the Stack used to store the locations of values equal to the pivot value. When porting the code to another language, ensure that a LIFO structure is used to store these indexes. An array may be used in place of a stack as long as LIFO order is used.

### PartitionByRange

C#
```public Tuple<int, int> partitionByRange(int[] a,int left,int right,int pivot) {

int pivotValue = a[pivot];
Stack<int> equalValuePositions = null;
swap(a,pivot,right);
for(int i=left;i<right;i++)
if(a[i] <= pivotValue) {
swap(a,left++,i);
if(a[i] == pivotValue) {
if(equalValuePositions == null) equalValuePositions = new Stack<int>();
equalValuePositions.Push(i);
}
}
swap(a,left,right);
var rightPivotIndex = left;
if(equalValuePositions != null) foreach(int i in equalValuePositions) swap(a,--left,i);
return new Tuple<int, int>(left,rightPivotIndex);

}

public void swap(int[] a,int i,int j){ if(i!=j){ int k=a[i]; a[i]=a[j]; a[j]=k; } }```

### QuickSortByRange

C#
```public void quickSortByRange(int[] a) {

qSortByRange(a,0,a.Length - 1,new Random());

}

public void qSortByRange(int[] a,int left,int right,Random r) {

if(left < right) {
int pivot = left + r.Next(right - left + 1);
Tuple<int, int> pivotRange = partitionByRange(a,left,right,pivot);
qSortByRange(a,left,pivotRange.Item1 - 1,r);
qSortByRange(a,pivotRange.Item2 + 1,right,r);
}

}```

### QuickSelectByRange

C#
```public int quickSelectByRange(int[] a,int n) {

return qSelectByRange(a,0,a.Length - 1,n,new Random());

}

public int qSelectByRange(int[] a,int left,int right,int n,Random r) {

if(left == right) return a[left];

int pivot = n;

while(true) {

Tuple<int, int> pivotRange = partitionByRange(a,left,right,pivot);
if(n >= pivotRange.Item1 && n <= pivotRange.Item2) return a[n];
if(n < pivotRange.Item1)
right = pivotRange.Item1 - 1;
else
left = pivotRange.Item2 + 1;
pivot = left + r.Next(right - left + 1);

}

}```

The PartitionByRange algorithm requires a Stack (or similar LIFO structure), extra comparisons, and it returns a tuple of two integers instead of a single integer.

These overhead requirements result in a performance hit which is not overcome until the array contains at least several duplicates.

## Performance

• In arrays with no duplicate entries, the QuickSort algorithm is about 8% faster than the QuickSortByRange algorithm.
• In an array with just a few duplicate entries, the QuickSort algorithm is about 9% faster than the QuickSortByRange algorithm.
• In an array with several duplicates, the QuickSortByRange algorithm is about 14% faster than the QuickSort algorithm.
• In an array with quite a few duplicates, the QuickSortByRange algorithm is about 76% faster than the QuickSort algorithm.
• In an array with lots of duplicates, the QuickSortByRange algorithm is about 98% faster than the QuickSort algorithm.
• In an array with all duplicates, the QuickSortByRange algorithm is about 99.9% faster than the QuickSort algorithm.

## History

In the original version of this article, I incorrectly stated that QuickSortByRange delivers O(1) performance on an array where every element contains the same value.

## Share

 Engineer Comprehend Systems United States
I've been fiddling around with computers since my parents bought me a Commodore VIC-20 for Christmas in 1981.

I received a B.Sc. in Mechanical Engineering from the University of Waterloo, but have spent my career in software development.

I focused on FoxPro during my early career before switching over to .NET and JavaScript.

I created Jooshe and wrote a Code Project article on it.

I wrote the front-end (Jooshe & JavaScript) and middleware (.NET) of dbiScript.

 First Prev Next
 My vote of 2 Qwertie24-Dec-13 7:34 Qwertie 24-Dec-13 7:34
 Um.... Qwertie23-Dec-13 14:15 Qwertie 23-Dec-13 14:15
 Message Closed 23-Dec-13 15:19 Brien Givens 23-Dec-13 15:19
 Re: Um.... Qwertie24-Dec-13 6:04 Qwertie 24-Dec-13 6:04
 Message Closed 24-Dec-13 6:35 Brien Givens 24-Dec-13 6:35
 Re: Um.... Qwertie24-Dec-13 7:32 Qwertie 24-Dec-13 7:32
 Re: Um.... Brien Givens26-Dec-13 8:15 Brien Givens 26-Dec-13 8:15
 Re: Um.... KP Lee4-Jan-14 21:44 KP Lee 4-Jan-14 21:44
 Re: Um.... KP Lee4-Jan-14 21:50 KP Lee 4-Jan-14 21:50
 Copying my code from the article I referenced is a real pain, this should be easier to copy:Copy Code ```/// /// creates a random array of positive/negitive numbers of the specified size /// /// Size of the array to generate /// The created array public static int[] GenRandVals(int sizeArray) { if (sizeArray <= 0) throw (new Exception(string.Format("GenRandVals(int sizeArray) called. sizeArray MUST be > 0. Value= {0} sent.", sizeArray))); int[] vals = new int[sizeArray]; int range = int.MaxValue, diff = range / 2; Random rand = new Random(); for (int i = 0; i < sizeArray; i++) vals[i] = rand.Next(range) - diff; return vals; } /// /// copies values from one array into another one /// /// The array to be copied /// copied array public static int[] CopyArray(int[] ar) { int len = ar.Length; int[] Cpy = new int[len]; for (int i = 0; i < len; i++) Cpy[i] = ar[i]; return Cpy; } static void Main(string[] args) { DateTime start = DateTime.Now; int arraySize = 200000, loc = 0; int[] testArray1 = GenRandVals(arraySize), testArray2 = CopyArray(testArray1), testArray3 = CopyArray(testArray1); TimeSpan span = new TimeSpan(DateTime.Now.Ticks - start.Ticks); Console.WriteLine(string.Format("Time to generate and copy two arrays: {0} of size: {1}", span.ToString(), arraySize)); start = DateTime.Now; BubbleSort(testArray1); span = new TimeSpan(DateTime.Now.Ticks - start.Ticks); Console.WriteLine(string.Format("Time to run original BubbleSort: {0}", span.ToString())); start = DateTime.Now; AltBubbleSort(testArray2); span = new TimeSpan(DateTime.Now.Ticks - start.Ticks); Console.WriteLine(string.Format("Time to run alternate version of BubbleSort: {0}", span.ToString())); Console.WriteLine(string.Format("nOrig/nAlt/nOrig loc. Values of both versions of BubbleSort with orig: {0}/{1}/{2} Location {3}", testArray1[loc], testArray2[loc], testArray3[loc],loc)); loc = arraySize - 1; Console.WriteLine(string.Format("nOrig/nAlt/nOrig loc. Values of both versions of BubbleSort with orig: {0}/{1}/{2} Location {3}", testArray1[loc], testArray2[loc], testArray3[loc], loc)); loc = arraySize / 2; Console.WriteLine(string.Format("nOrig/nAlt/nOrig loc. Values of both versions of BubbleSort with orig: {0}/{1}/{2} Location {3}", testArray1[loc], testArray2[loc], testArray3[loc], loc)); start = DateTime.Now; } /// /// swap two items of an array /// /// int[] array to work a swap /// First of two positions to swap /// Second position private static void Swap(int[] ar, int i1, int i2) { int hold = ar[i1]; ar[i1] = ar[i2]; ar[i2] = hold; } /// /// Binary sort method with custom modifications. Recursive code splits array hopefully in half /// and calls each half to sort them. /// /// int array to be sorted /// Starting index location in the array to be sorted /// Ending index location public static void BinSort(int[] ar, int start, int end) { int middle, middlel, middlevalue, hold, left, right, swapl, swapr; if (start > end) { hold = start; start = end; end = hold; } if (start < 0) start = 0; if (end >= ar.Length) end = ar.Length - 1; middle = (start + end) / 2; left = middle - 1; right = middle + 1; if (ar[start] > ar[end]) Swap(ar, start, end); if (ar[middle] > ar[start] && ar[middle] > ar[end]) Swap(ar, middle, end); else if (ar[middle] < ar[start] && ar[middle] < ar[end]) Swap(ar, middle, start); middlel = middle; middlevalue = ar[middle]; //left moves lower, right moves higher. middlel/middle point to the middle value(s) that won't be sorted when this finishes //start end don't change from this point forward bool finished = false; while (!finished) { while (left >= start && ar[left] < middlevalue) left--; while (right <= end && ar[right] > middlevalue) right++; if (left >= start && ar[left] == middlevalue) Swap(ar, left--, --middlel); if (right <= end && ar[right] == middlevalue) Swap(ar, right++, ++middle); if ((left >= start && ar[left] > middlevalue) && (right <= end && ar[right] < middlevalue)) Swap(ar, left--, right++); if (left < start) { if (right <= end && ar[right] < middlevalue) {//need to move the value on the right to the left side Swap(ar, middlel, ++middle);//Move what used to be to right of the middle value to the left if (right != middle)//after the right is moved to the left both pointers are shifted right Swap(ar, right, middlel++); else middlel++; right++; } if (right > end) finished = true; } else if (right > end) { if (left >= start && ar[left] > middlevalue) { Swap(ar, middle, --middlel); if (left != middlel) Swap(ar, left, middle--); else middle--; left--; } if (left < start) finished = true; } } if (middlel - start > 1) BinSort(ar, start, --middlel); else if (ar[middlel] < ar[start]) Swap(ar, middlel, start); if (end - middle > 1) BinSort(ar, ++middle, end); else if (ar[middle] > ar[end]) Swap(ar, middle, end); } ```
 Re: Um.... KP Lee4-Jan-14 22:09 KP Lee 4-Jan-14 22:09
 Re: Um.... KP Lee5-Jan-14 1:21 KP Lee 5-Jan-14 1:21
 Suggestion lafffer_11923-Dec-13 11:31 lafffer_119 23-Dec-13 11:31
 My vote of 5 John Brett18-Dec-13 0:43 John Brett 18-Dec-13 0:43
 Re: My vote of 5 Brien Givens18-Dec-13 0:48 Brien Givens 18-Dec-13 0:48
 Very nice article about QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms Volynsky Alex17-Dec-13 11:24 Volynsky Alex 17-Dec-13 11:24
 Re: Very nice article about QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms Brien Givens17-Dec-13 11:29 Brien Givens 17-Dec-13 11:29
 Re: Very nice article about QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms Volynsky Alex17-Dec-13 11:37 Volynsky Alex 17-Dec-13 11:37
 Last Visit: 31-Dec-99 19:00     Last Update: 2-Dec-21 14:45 Refresh 1