|

Introduction
The bit of code below is an in-place sorting routine that, by and large, is faster than the Shell Sort - nearly as fast as Quick Sort, and in some cases faster. It is stable, easy to implement, is not recursive, and does not use the stack. There is little difference in its speed no matter how the list/array is ordered. Overall, I believe you will find it very useful to use for any sorting needs. Any feedback would be appreciated.
Background
As I laid in bed one night, thinking about a couple of sorting routines I had used in a recent project, I began to tinker with the idea of creating my own routine. Since a computer is much like a person without vision, I tried to imagine how (as a blind person) I might sort a list of items in Braille. In a matter of a few minutes, the algorithm below occurred to me. I got up, wrote it down in rough form, and then fell asleep. The next morning I set up a simple program to test the algorithm - and it worked beautifully.
As the code comments below indicate, we start by using a simple scan of the list to see if it is already sorted. If not, we start scanning our list below the first item to find the smallest item. Once the smallest item is found, we compare it to our current list item. If the smallest item found is smaller than our current item then we perform a swap. We then move down to the next item in our list/array and repeat the process and keep doing this until we are done.
The process is so simple that I'm surprised no one has thought of it before. As the sort progresses, the list gets shorter and goes faster and faster ...
I think you will find that the swapping of items is reduced to an absolute minimum in this method. On unsorted lists, it performs very well indeed, - but on lists that are very nearly sorted, you should probably use the Shell sort. I made an initial attempt to incorporate the Shell Sort (and a modified/enhanced version of it), but had mixed results (which depended on how sorted the list already was).
I'd love to get some feedback on my approach - especially if you can suggest any improvements. Feel free to use the code, and download the sample demo program and source code and experiment with it.
Please note that I used Microsoft's new Visual Basic. NET 2005 beta program to write and compile the code. With a few tweaks, it should readily run on other software platforms.
Using the code
The code below sorts in ascending format, A to Z. Converting it to descending format is fairly easy (an exercise I will leave to you - unless you need help). As it is listed below, the code is set up to sort strings from a ListBox, but it can easily be adapted to sort numbers or records or whatever you are wanting to sort. Private Sub HumanSort()
Dim i As Integer
Dim numRecs As Integer
Dim tempRec As String
Dim NextSmallRec As String
Dim NextSmallIndex As Integer
Dim CurrentRec As String
Dim numsorted As Integer
Dim Ascending As Integer
numRecs = ListBox1.Items.Count - 1
numsorted = 0
Ascending = 0
If numRecs < 2 Then
GoTo VeryEnd
Else
For i = 0 To numRecs - 1
If ListBox1.Items(i) <= ListBox1.Items(i + 1) Then
Ascending = Ascending + 1
End If
Next
End If
If Ascending = numRecs Then GoTo VeryEnd
Do While numsorted < numRecs
CurrentRec = ListBox1.Items(numsorted)
NextSmallIndex = numsorted
NextSmallRec = ListBox1.Items(NextSmallIndex)
For i = (numsorted + 1) To numRecs
If NextSmallRec > ListBox1.Items(i) Then
NextSmallIndex = i
NextSmallRec = ListBox1.Items(i)
End If
Next
If NextSmallIndex >= numsorted + 1 Then
If CurrentRec > NextSmallRec Then
tempRec = ListBox1.Items(numsorted)
ListBox1.Items(numsorted) = ListBox1.Items(NextSmallIndex)
ListBox1.Items(NextSmallIndex) = tempRec
ElseIf CurrentRec = NextSmallRec Then
numsorted = numsorted + 1
tempRec = ListBox1.Items(numsorted)
ListBox1.Items(numsorted) = ListBox1.Items(NextSmallIndex)
ListBox1.Items(NextSmallIndex) = tempRec
End If
End If
numsorted = numsorted + 1
Loop
VeryEnd:
End Sub
Points of Interest
If you download the demo code, you will also find two implementations of the Shell sort. One is fairly generic, but the other uses an array of predetermined gap sizes that outperforms any other such predetermined gap list I've seen.
The demo lets you sort though a list of 10,000 numbers (sorted as strings) using the Human sort, the generic Shell sort and an enhanced Shell sort. You can opt to select a randomly generated list, a descending list, an ascending list, a list that is half ascending and half descending, one that is 75% descending and 25% random, and one that is 75% ascending and 25% random.
The sorting times for each method are shown in seconds and milliseconds. As I said at the beginning, the sorting time for the Human sort is fairly constant - no matter how the list/array is ordered.
History
This is version 1.0.
| You must Sign In to use this message board. |
|
| | Msgs 1 to 22 of 22 (Total in Forum: 22) (Refresh) | FirstPrevNext |
|
 |
|
|
A few years ago, I coded an implementation of selection sort which uses O(NlgN) comparisons. It actually uses slightly fewer comparisons than the standard C library qsort() routine, though at the expense of greater bookkeeping cost. If one were handling data types where comparisons were expensive, or if one needed to have the first items in the list available sooner than the last items, it might be useful, but otherwise not.
I call it "tournament sort", since it represents the optimal means of finding the first 'n' places in a pairwise competition. An item "wins" a comparison if it should be output before the other one; other related terms ("loser", "champion", etc.) have their naturally-related meanings.
Each item needs a few properties: - Status: eligible, ineligible, or dead (already output) - LastBeat: The last item it beat, if any - Superiority: How many items is it known to be better than ('superiority count')
Initially, all items are eligible, have beaten nobody, and are better than zero items.
Outer loop (once for each item) - Inner loop: (once for each eligible item, minus one) - - Compare the two eligible items whose superiority is lowest and compare them - - - Loser: Make item ineligible - - - Winner: Add loser's superiority count, plus one, to one's own. Set LastBeat to loser. - Output champion, set status to dead - Traverse list of LastBeat items starting at winner and make those items eligible
In the inner loop, it's okay to compare items which are close to having the lowest superiority, even if they're not quite the lowest. In my test implementation, I used lists for each power-of-two superiority level (a list for 0, 1, 2-3, 4-7, 8-15, etc.) and that seemed to work well.
The first time the inner loop runs, all items will be eligible. The second time, lgN items will be eligible. On subsequent passes, the number of eligible items will vary, but it will never be greater than lgN.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
 |
|
|
After taking data structures in college I was alway facinated with the different sort methods and how they perform. Radix sort is one of the more interesting sorting methods I've seen. http://en.wikipedia.org/wiki/Radix_sort Worst case it is simply an N*ln(N) type sort. Best case it approaches k*N where k is the index length. I'd expect SQL querries use something similar to this for indexed field sorts and a combination of N*ln(N) methods for non-indexed field sorts. It would be a great if someone with direct knowledge would clarify how SQL sorts. And maybe someone could post some indexed sort code in the interest of spreading the knowledge.
|
| Sign In·View Thread·PermaLink | 1.33/5 (2 votes) |
|
|
|
 |
|
|
 |
|
|
If you liked this, wait til you see the article I have planned for next week.
I've invented an algorithm that'll put all these sorted numbers back in random order faster than anything you've ever seen!
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Your algorithm is called Selection Sort and it is very very well known! So you're not the first to come up with it. I'm surprised you're familiar with QuickSort and ShellSort but not with Selection Sort!! You should always check the literature before claiming something is your own.
|
| Sign In·View Thread·PermaLink | 1.75/5 (4 votes) |
|
|
|
 |
|
|
 |
|
|
 |
|
|
 |
|
|
BigO(n^2) proves the previous poster's point. That is a poor algorithm. Optimally it would be something like BigO(ln(n)) or BigO(ln(n)*n).
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
You cannot have a sort routine in sub O(n) time. O(ln n) would not allow the routine to check every element in list of size N.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Or we could argue about the Big 'O' representation of algorithm speed instead of offering anything constructive
To the OP: Good work but yes, your algorithm has already been prescribed in Selection Sort. Nonetheless, keep up the good work. Posts like this churn ideas and invoke quality thought processes.
|
| Sign In·View Thread·PermaLink | 2.00/5 (1 vote) |
|
|
|
 |
|
|
This seems to be a good example of the type of sorting algorithm it is. Unfortunately, that type of sort is very slow: O(n^2) performance, which means that it tests roughly every element against evey other element. For short lists, it's fine; for 100 elements, even 1000 elements, it'll be acceptable in a lot of cases. But for long lists, 10,000, 100,000, or more, it's unacceptably slow. You can do a lot better with something like a heapsort or a quicksort, both of which can be done in-place without blowing your stack away.
As it happens, I don't have VB on this new laptop yet, so I translated HumanSort into Java to do some simple tests against (1) a brute-force sort routine I thought was great when I wrote it (back when disco was big), and (2) the built-in tuned Quicksort routine shipped with Java. All times are in milliseconds.
For 100 items, too fast to register a difference. 0 milliseconds all around. For 1,000 items, still fast, but starting to show some difference. <20 ms. For 10,000 items, HumanSort does about twice as well as the brute-force sort. Unfortunately, it also does >50 times worse than the quicksort. But, may be OK. For 100,000 random numbers in a list, neither brute-force nor HumanSort were anywhere close to fast. Quicksort, though, was still comfortably under 1 sec. I would be happy to sort >1,000,000 items with the quicksort. The other sorts, though, just won't hack it for arrays that big.
Numbers as printed by the program: C:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 0 milliseconds HumanSort time for 100 items: 0 milliseconds. Sorted as longs: 0 milliseconds Sorted as Strings: 0 milliseconds
C:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 20 milliseconds HumanSort time for 1000 items: 20 milliseconds. Sorted as longs: 0 milliseconds Sorted as Strings: 0 milliseconds
C:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 2084 milliseconds HumanSort time for 10000 items: 1192 milliseconds. Sorted as longs: 0 milliseconds Sorted as Strings: 20 milliseconds
C:\Documents and Settings\cburke\My Documents>\cC:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 785683 milliseconds HumanSort time for 100000 items: 499143 milliseconds. Sorted as longs: 40 milliseconds Sorted as Strings: 140 milliseconds
Java source code, if you want to try it yourself: // Everything not copyright Clark May is placed in the public domain.
import java.util.*;
public class SimpleSortTest { private long[] items = new long[10000]; private String[] itemsAsStrings = new String[10000]; public void init() { Random rnd = new Random(); for (int i=0; i<items.length; i++) { items[i] = rnd.nextInt(230543143); itemsAsStrings[i] = ""+items[i]; } } public void n2sort() // do comparison of strings to minimize differences from HumanSort { long startTime = System.currentTimeMillis(); for (int i=0; i<itemsAsStrings.length-1; i++) { for (int j=i+1; j<itemsAsStrings.length; j++) { if (itemsAsStrings[i].compareTo(itemsAsStrings[j]) > 0) { String temp=itemsAsStrings[i]; itemsAsStrings[i]=itemsAsStrings[j]; itemsAsStrings[j] = temp; } } } long endTime = System.currentTimeMillis(); long elapsedTime = endTime-startTime; System.out.println("Sorted as strings (selection/bubble sort): "+elapsedTime+" milliseconds"); } public void sort() { long startTime = System.currentTimeMillis(); Arrays.sort(items); long endTime = System.currentTimeMillis(); long elapsedTime = endTime-startTime; System.out.println("Sorted as longs: "+elapsedTime+" milliseconds"); } public void sortAsStrings() { long startTime = System.currentTimeMillis(); Arrays.sort(itemsAsStrings); long endTime = System.currentTimeMillis(); long elapsedTime = endTime-startTime; System.out.println("Sorted as Strings: "+elapsedTime+" milliseconds"); } private void humanSort() { //HumanSort - Intellectual copyright of Clark Hay - created on 02/15/2005 //This is a unique and very fast in-place sort. //I wanted something that sorts like a blind human might, but at computer speeds. //The basic idea is to scan the list/array first to see if it is already sorted. //If it is, then we//re done. //If the list/array is almost sorted then you can use the Shell sort. //Otherwise use the Human sort. //Starting with the item below the current item in our list/array - //we scan our list/array to find the smallest item. //When the smallest item is found, we compare it to our current item. //If the smallest item found is smaller than our current item then we perform a swap. //We then move down to the next item in or list/array and repeat the process. //We keep doing this till we are done. //This is a simple and quick process. //I//m surprised no one has thought of it before. //As the sort progresses, the list gets shorter and goes faster and faster ... //The swapping of items is reduced to the absolute minimum in this method. //On unsorted lists it performs very well indeed, - //but on lists that are very nearly sorted, you should use the Shell sort. //The Human Sort does not use recursion & doesn//t have stack overflow problems.
int i; int numRecs; String tempRec; String NextSmallRec; int NextSmallIndex; String CurrentRec; int numsorted; int Ascending;
//===> Get number of records in database numRecs = items.length;
//Set our counter to 0 numsorted = 0;
//Set our variables for the timer //(You can remove these lines later - it is just here for the demo program) long StartTime; long EndTime;
//Get the time we start the sort //(You can remove this line later - it is just here for the demo program) StartTime = System.currentTimeMillis();
//*************************************************** //* Here is where we see if our items are sorted * //*************************************************** // // don't have goto available, so will just return when no sort needed. // //Set our counter to zero Ascending = 0;
if (numRecs < 2) return; //===> if only 1 or 0 items, then there is nothing to sort else { //===> See how many items are in descending or ascending order: for (i = 0; i < numRecs - 1; i++) { if (itemsAsStrings[i].compareTo(itemsAsStrings[i + 1]) <= 0) { Ascending = Ascending + 1; } } }
//===> if our list is already sorted in ascending order, then there is nothing to sort if (Ascending == numRecs) return;
//I//ve commented out the next 4 lines for a reason: //As you play with the demo sort program, you will notice there are times - //when the ShellSort out-performs the Human Sort and the ShellSortEnhanced. //I haven//t yet determined which sort to use. //It all depends on a number of factors that I have yet to figure out. //If Ascending > Int(numRecs * 0.9) Then //ShellSort() //GoTo VeryEnd //End If
//===> otherwise we//ll default to our ascending Human sort routine: //*************************************************** //* This is the ascending version of the Human sort * //*************************************************** while (numsorted < numRecs) { //===> Get the zip code from the current record in our list CurrentRec = itemsAsStrings[numsorted]; NextSmallIndex = numsorted; NextSmallRec = itemsAsStrings[NextSmallIndex];
for (i = (numsorted + 1); i< numRecs; i++) { //===> see if the next zipcode down is > than what we//ve found so far if (NextSmallRec.compareTo(itemsAsStrings[i]) > 0) { //===> If we//ve found something smaller or = then get it//s index NextSmallIndex = i; //===> and record it//s value to see if anything is smaller than it NextSmallRec = itemsAsStrings[i]; } }
if (NextSmallIndex >= numsorted + 1) { //===> Now that we know the smallest item in our list, we see if we need to swap if (CurrentRec.compareTo(NextSmallRec) > 0) { tempRec = itemsAsStrings[numsorted]; itemsAsStrings[numsorted] = itemsAsStrings[NextSmallIndex]; itemsAsStrings[NextSmallIndex] = tempRec; } else if (CurrentRec.equals(NextSmallRec)) { numsorted = numsorted + 1; // tempRec = itemsAsStrings[numsorted]; itemsAsStrings[numsorted] = itemsAsStrings[NextSmallIndex]; itemsAsStrings[NextSmallIndex] = tempRec; } }
//===> Now we can increment the number of records we have sorted numsorted = numsorted + 1;
//===> and we keep going till we are done }
//VeryEnd: //Lastly we get the ending time of our sort and calculate the time it took EndTime = System.currentTimeMillis(); StartTime = EndTime - StartTime; //Now we can display the seconds it took System.out.println("HumanSort time for "+items.length+" items: "+StartTime+" milliseconds."); } static public void main(String[] args) { SimpleSortTest sst = new SimpleSortTest(); sst.init(); sst.n2sort(); sst.init(); sst.humanSort(); sst.init(); sst.sort(); sst.sortAsStrings(); } }
|
| Sign In·View Thread·PermaLink | 4.50/5 (3 votes) |
|
|
|
 |
 | Neat! |  | Lou Zher | 15:10 29 Mar '05 |
|
|
While it is true that part of the sort is described as a selection sort, it is kinda cool to play around with optimizations like this. I would encourge you to read about sorting... there are whole books on the topic. (Knuth: http://www.amazon.com/exec/obidos/tg/detail/-/0201896850) The neat thing about sorting algorithms is that some work better than others in some situations... Often, as you have probably found, sometimes it's best to combine algorithms to handle a situation when certain criteria are known about the sorted-ness of the data.
Selection sort, or "Min Scan/Exchange Sort" as I called it when I "invented" it, has a very nice property in that it involves a very low and predictable number of exchanges (writes) which is nice in certain situations where the read cost is much lower than write costs (as would be the case in EEPROMs or Flash memory).
Just to be clear: I did NOT actually invent this sort algorithm... that's why invent was in quotes; implying so-called.
One of the things that always irked me about quicksort, which don't get me wrong is a very cool algorithm, is that it takes a stupid amount of time to sort a nearly sorted list. I played around a lot with pivotpoint picker algorithms to try to optimize list halving as well; to limit the recursion a bit. A common trick is to only recursively call quicksort on the small list half rather than both as is the textbook way, then use a loop wrap to reprocess the larger remaining part at the same recursion level.
Here's an interesting connection: Supposedly the bubble sort also arose from a modeling of how humans sort. Imagine that you are trying to sort a deck of cards. You would start with your unsorted pile and start picking the top one off that pile and placing it in your hand in the proper position. Bubble sort does this by walking the 'card' through your 'hand' until it sees a card lower than the one being walked. It then grabs the next card.
Have fun and don't let the fact that "it's already been done" discourage you. In fact, it should encourage you because your idea was so good that it is used all the time... it's just someone came up with it already. I call that validation! -LZ
|
| Sign In·View Thread·PermaLink | 5.00/5 (1 vote) |
|
|
|
 |
|
|
Good answer! 
It reminds me of when I started programming, it did not realy matter that I was reinventing a few wheels. What I was learning was far more important. I "invented" the most commonly used (well known) line drawing algorithm (forgot name), how was I to know someone had wrote a paper on it years ealier (no Web then). All it took was some graphing paper and an understanding of basic algebra.
INTP Every thing is relative...
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
 |
|
|
 |
|
|
I think bubble-sort works with more swapping of elements. bubblesort always swaps two adjacenced elements depending of the ordering, selection sort first scan the whole list and selects two members to swap and do only one swap per run...
|
| Sign In·View Thread·PermaLink | 4.00/5 (1 vote) |
|
|
|
 |
|
|
I learned bubble sort in college (1973), and it's pretty much the same as the "human sort" above, at least as I learned it...
Barbara Crane Auction Systems www.auctionsystems.com Fundraising Auction Software
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
I've used a similar sorting algoritm, but with a small twist to make it even faster.
Run through the list, finding the smallest AND the largest item, then swap them to the top and the bottom. Then continue to run through the unsorted remainder of the list untli you are done.
It will not take much longer to check for both smallest and largest, but it will cut the number of runs through the list in half and will make the average run half as long, resulting in an almost x4 increase in speed.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
I believe you will find this to be the same thing. The only difference is you are turning half the compares to find the smaller number into ones that find the larger number and you have to make 2 swaps, half for the smaller number and half for the larger number. Another way of saying this is steping through the list from one end to the other to pull out the next smallest in the unsorted list will take the same number of steps as steping in from each end one after another.
|
| Sign In· | | | | | |