 |
|
 |
A few years ago, I coded an implementation of selection sort which uses O(NlgN) comparisons. It actually uses slightly fewer comparisons than the standard C library qsort() routine, though at the expense of greater bookkeeping cost. If one were handling data types where comparisons were expensive, or if one needed to have the first items in the list available sooner than the last items, it might be useful, but otherwise not.
I call it "tournament sort", since it represents the optimal means of finding the first 'n' places in a pairwise competition. An item "wins" a comparison if it should be output before the other one; other related terms ("loser", "champion", etc.) have their naturally-related meanings.
Each item needs a few properties:
- Status: eligible, ineligible, or dead (already output)
- LastBeat: The last item it beat, if any
- Superiority: How many items is it known to be better than ('superiority count')
Initially, all items are eligible, have beaten nobody, and are better than zero items.
Outer loop (once for each item)
- Inner loop: (once for each eligible item, minus one)
- - Compare the two eligible items whose superiority is lowest and compare them
- - - Loser: Make item ineligible
- - - Winner: Add loser's superiority count, plus one, to one's own. Set LastBeat to loser.
- Output champion, set status to dead
- Traverse list of LastBeat items starting at winner and make those items eligible
In the inner loop, it's okay to compare items which are close to having the lowest superiority, even if they're not quite the lowest. In my test implementation, I used lists for each power-of-two superiority level (a list for 0, 1, 2-3, 4-7, 8-15, etc.) and that seemed to work well.
The first time the inner loop runs, all items will be eligible. The second time, lgN items will be eligible. On subsequent passes, the number of eligible items will vary, but it will never be greater than lgN.
|
|
|
|
 |
|
 |
The Human Sort is selection sort?! I think it exactly is??
|
|
|
|
 |
|
 |
After taking data structures in college I was alway facinated with the different sort methods and how they perform.
Radix sort is one of the more interesting sorting methods I've seen.
http://en.wikipedia.org/wiki/Radix_sort
Worst case it is simply an N*ln(N) type sort. Best case it approaches k*N where k is the index length.
I'd expect SQL querries use something similar to this for indexed field sorts and a combination of N*ln(N) methods for non-indexed field sorts.
It would be a great if someone with direct knowledge would clarify how SQL sorts.
And maybe someone could post some indexed sort code in the interest of spreading the knowledge.
|
|
|
|
 |
|
 |
Keep up the good work. We need more articles like this one.
TL Wallace
http://www.tlwallace.net
|
|
|
|
 |
|
 |
If you liked this, wait til you see the article I have planned for next week.
I've invented an algorithm that'll put all these sorted numbers back in random order faster than anything you've ever seen!
|
|
|
|
 |
|
 |
Your algorithm is called Selection Sort and it is very very well known! So you're not the first to come up with it. I'm surprised you're familiar with QuickSort and ShellSort but not with Selection Sort!! You should always check the literature before claiming something is your own.
|
|
|
|
 |
|
 |
This looks like the Shell-Metzner sort routine that I learned using Fortran back in 1980.
http://www.nist.gov/dads/HTML/shellsort.html
At least is is now converted for .NET.
Per article, it takes only 10 lines of code in Fortran.
|
|
|
|
 |
|
 |
Take a Datastructures course. This algorithm performs BigO(n) poorly.
|
|
|
|
 |
|
 |
Maybe you should too before posting. It's BigO(n^2).
|
|
|
|
 |
|
 |
BigO(n^2) proves the previous poster's point. That is a poor algorithm. Optimally it would be something like BigO(ln(n)) or BigO(ln(n)*n).
|
|
|
|
 |
|
 |
You cannot have a sort routine in sub O(n) time. O(ln n) would not allow the routine to check every element in list of size N.
|
|
|
|
 |
|
 |
Or we could argue about the Big 'O' representation of algorithm speed instead of offering anything constructive
To the OP: Good work but yes, your algorithm has already been prescribed in Selection Sort. Nonetheless, keep up the good work. Posts like this churn ideas and invoke quality thought processes.
|
|
|
|
 |
|
 |
This seems to be a good example of the type of sorting algorithm it is. Unfortunately, that type of sort is very slow: O(n^2) performance, which means that it tests roughly every element against evey other element. For short lists, it's fine; for 100 elements, even 1000 elements, it'll be acceptable in a lot of cases. But for long lists, 10,000, 100,000, or more, it's unacceptably slow. You can do a lot better with something like a heapsort or a quicksort, both of which can be done in-place without blowing your stack away. As it happens, I don't have VB on this new laptop yet, so I translated HumanSort into Java to do some simple tests against (1) a brute-force sort routine I thought was great when I wrote it (back when disco was big), and (2) the built-in tuned Quicksort routine shipped with Java. All times are in milliseconds. For 100 items, too fast to register a difference. 0 milliseconds all around. For 1,000 items, still fast, but starting to show some difference. <20 ms. For 10,000 items, HumanSort does about twice as well as the brute-force sort. Unfortunately, it also does >50 times worse than the quicksort. But, may be OK. For 100,000 random numbers in a list, neither brute-force nor HumanSort were anywhere close to fast. Quicksort, though, was still comfortably under 1 sec. I would be happy to sort >1,000,000 items with the quicksort. The other sorts, though, just won't hack it for arrays that big. Numbers as printed by the program: C:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 0 milliseconds HumanSort time for 100 items: 0 milliseconds. Sorted as longs: 0 milliseconds Sorted as Strings: 0 milliseconds C:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 20 milliseconds HumanSort time for 1000 items: 20 milliseconds. Sorted as longs: 0 milliseconds Sorted as Strings: 0 milliseconds C:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 2084 milliseconds HumanSort time for 10000 items: 1192 milliseconds. Sorted as longs: 0 milliseconds Sorted as Strings: 20 milliseconds C:\Documents and Settings\cburke\My Documents>\cC:\Documents and Settings\cburke\My Documents>java SimpleSortTest Sorted as strings (selection/bubble sort): 785683 milliseconds HumanSort time for 100000 items: 499143 milliseconds. Sorted as longs: 40 milliseconds Sorted as Strings: 140 milliseconds Java source code, if you want to try it yourself: // Everything not copyright Clark May is placed in the public domain. import java.util.*; public class SimpleSortTest { private long[] items = new long[10000]; private String[] itemsAsStrings = new String[10000]; public void init() { Random rnd = new Random(); for (int i=0; i<items.length; i++) { items[i] = rnd.nextInt(230543143); itemsAsStrings[i] = ""+items[i]; } } public void n2sort() // do comparison of strings to minimize differences from HumanSort { long startTime = System.currentTimeMillis(); for (int i=0; i<itemsAsStrings.length-1; i++) { for (int j=i+1; j<itemsAsStrings.length; j++) { if (itemsAsStrings[i].compareTo(itemsAsStrings[j]) > 0) { String temp=itemsAsStrings[i]; itemsAsStrings[i]=itemsAsStrings[j]; itemsAsStrings[j] = temp; } } } long endTime = System.currentTimeMillis(); long elapsedTime = endTime-startTime; System.out.println("Sorted as strings (selection/bubble sort): "+elapsedTime+" milliseconds"); } public void sort() { long startTime = System.currentTimeMillis(); Arrays.sort(items); long endTime = System.currentTimeMillis(); long elapsedTime = endTime-startTime; System.out.println("Sorted as longs: "+elapsedTime+" milliseconds"); } public void sortAsStrings() { long startTime = System.currentTimeMillis(); Arrays.sort(itemsAsStrings); long endTime = System.currentTimeMillis(); long elapsedTime = endTime-startTime; System.out.println("Sorted as Strings: "+elapsedTime+" milliseconds"); } private void humanSort() { //HumanSort - Intellectual copyright of Clark Hay - created on 02/15/2005 //This is a unique and very fast in-place sort. //I wanted something that sorts like a blind human might, but at computer speeds. //The basic idea is to scan the list/array first to see if it is already sorted. //If it is, then we//re done. //If the list/array is almost sorted then you can use the Shell sort. //Otherwise use the Human sort. //Starting with the item below the current item in our list/array - //we scan our list/array to find the smallest item. //When the smallest item is found, we compare it to our current item. //If the smallest item found is smaller than our current item then we perform a swap. //We then move down to the next item in or list/array and repeat the process. //We keep doing this till we are done. //This is a simple and quick process. //I//m surprised no one has thought of it before. //As the sort progresses, the list gets shorter and goes faster and faster ... //The swapping of items is reduced to the absolute minimum in this method. //On unsorted lists it performs very well indeed, - //but on lists that are very nearly sorted, you should use the Shell sort. //The Human Sort does not use recursion & doesn//t have stack overflow problems. int i; int numRecs; String tempRec; String NextSmallRec; int NextSmallIndex; String CurrentRec; int numsorted; int Ascending; //===> Get number of records in database numRecs = items.length; //Set our counter to 0 numsorted = 0; //Set our variables for the timer //(You can remove these lines later - it is just here for the demo program) long StartTime; long EndTime; //Get the time we start the sort //(You can remove this line later - it is just here for the demo program) StartTime = System.currentTimeMillis(); //*************************************************** //* Here is where we see if our items are sorted * //*************************************************** // // don't have goto available, so will just return when no sort needed. // //Set our counter to zero Ascending = 0; if (numRecs < 2) return; //===> if only 1 or 0 items, then there is nothing to sort else { //===> See how many items are in descending or ascending order: for (i = 0; i < numRecs - 1; i++) { if (itemsAsStrings[i].compareTo(itemsAsStrings[i + 1]) <= 0) { Ascending = Ascending + 1; } } } //===> if our list is already sorted in ascending order, then there is nothing to sort if (Ascending == numRecs) return; //I//ve commented out the next 4 lines for a reason: //As you play with the demo sort program, you will notice there are times - //when the ShellSort out-performs the Human Sort and the ShellSortEnhanced. //I haven//t yet determined which sort to use. //It all depends on a number of factors that I have yet to figure out. //If Ascending > Int(numRecs * 0.9) Then //ShellSort() //GoTo VeryEnd //End If //===> otherwise we//ll default to our ascending Human sort routine: //*************************************************** //* This is the ascending version of the Human sort * //*************************************************** while (numsorted < numRecs) { //===> Get the zip code from the current record in our list CurrentRec = itemsAsStrings[numsorted]; NextSmallIndex = numsorted; NextSmallRec = itemsAsStrings[NextSmallIndex]; for (i = (numsorted + 1); i< numRecs; i++) { //===> see if the next zipcode down is > than what we//ve found so far if (NextSmallRec.compareTo(itemsAsStrings[i]) > 0) { //===> If we//ve found something smaller or = then get it//s index NextSmallIndex = i; //===> and record it//s value to see if anything is smaller than it NextSmallRec = itemsAsStrings[i]; } } if (NextSmallIndex >= numsorted + 1) { //===> Now that we know the smallest item in our list, we see if we need to swap if (CurrentRec.compareTo(NextSmallRec) > 0) { tempRec = itemsAsStrings[numsorted]; itemsAsStrings[numsorted] = itemsAsStrings[NextSmallIndex]; itemsAsStrings[NextSmallIndex] = tempRec; } else if (CurrentRec.equals(NextSmallRec)) { numsorted = numsorted + 1; // tempRec = itemsAsStrings[numsorted]; itemsAsStrings[numsorted] = itemsAsStrings[NextSmallIndex]; itemsAsStrings[NextSmallIndex] = tempRec; } } //===> Now we can increment the number of records we have sorted numsorted = numsorted + 1; //===> and we keep going till we are done } //VeryEnd: //Lastly we get the ending time of our sort and calculate the time it took EndTime = System.currentTimeMillis(); StartTime = EndTime - StartTime; //Now we can display the seconds it took System.out.println("HumanSort time for "+items.length+" items: "+StartTime+" milliseconds."); } static public void main(String[] args) { SimpleSortTest sst = new SimpleSortTest(); sst.init(); sst.n2sort(); sst.init(); sst.humanSort(); sst.init(); sst.sort(); sst.sortAsStrings(); } }
|
|
|
|
 |
 | Neat!  |  | Lou Zher | 15:10 29 Mar '05 |
|
 |
While it is true that part of the sort is described as a selection sort, it is kinda cool to play around with optimizations like this. I would encourge you to read about sorting... there are whole books on the topic. (Knuth: http://www.amazon.com/exec/obidos/tg/detail/-/0201896850) The neat thing about sorting algorithms is that some work better than others in some situations... Often, as you have probably found, sometimes it's best to combine algorithms to handle a situation when certain criteria are known about the sorted-ness of the data.
Selection sort, or "Min Scan/Exchange Sort" as I called it when I "invented" it, has a very nice property in that it involves a very low and predictable number of exchanges (writes) which is nice in certain situations where the read cost is much lower than write costs (as would be the case in EEPROMs or Flash memory).
Just to be clear: I did NOT actually invent this sort algorithm... that's why invent was in quotes; implying so-called.
One of the things that always irked me about quicksort, which don't get me wrong is a very cool algorithm, is that it takes a stupid amount of time to sort a nearly sorted list. I played around a lot with pivotpoint picker algorithms to try to optimize list halving as well; to limit the recursion a bit. A common trick is to only recursively call quicksort on the small list half rather than both as is the textbook way, then use a loop wrap to reprocess the larger remaining part at the same recursion level.
Here's an interesting connection: Supposedly the bubble sort also arose from a modeling of how humans sort. Imagine that you are trying to sort a deck of cards. You would start with your unsorted pile and start picking the top one off that pile and placing it in your hand in the proper position. Bubble sort does this by walking the 'card' through your 'hand' until it sees a card lower than the one being walked. It then grabs the next card.
Have fun and don't let the fact that "it's already been done" discourage you. In fact, it should encourage you because your idea was so good that it is used all the time... it's just someone came up with it already. I call that validation!
-LZ
|
|
|
|
 |
|
 |
Good answer!
It reminds me of when I started programming, it did not realy matter that I was reinventing a few wheels. What I was learning was far more important. I "invented" the most commonly used (well known) line drawing algorithm (forgot name), how was I to know someone had wrote a paper on it years ealier (no Web then). All it took was some graphing paper and an understanding of basic algebra.
INTP
Every thing is relative...
|
|
|
|
 |
|
|
 |
|
 |
isn't this just a bubble sort ???
-= Technomancer =-
|
|
|
|
 |
|
 |
I think bubble-sort works with more swapping of elements. bubblesort always swaps two adjacenced elements depending of the ordering, selection sort first scan the whole list and selects two members to swap and do only one swap per run...
|
|
|
|
 |
|
 |
I learned bubble sort in college (1973), and it's pretty much the same as the "human sort" above, at least as I learned it...
Barbara Crane
Auction Systems
www.auctionsystems.com
Fundraising Auction Software
|
|
|
|
 |
|
 |
I've used a similar sorting algoritm, but with a small twist to make it even faster.
Run through the list, finding the smallest AND the largest item, then swap them to the top and the bottom. Then continue to run through the unsorted remainder of the list untli you are done.
It will not take much longer to check for both smallest and largest, but it will cut the number of runs through the list in half and will make the average run half as long, resulting in an almost x4 increase in speed.
|
|
|
|
 |
|
 |
I believe you will find this to be the same thing. The only difference is you are turning half the compares to find the smaller number into ones that find the larger number and you have to make 2 swaps, half for the smaller number and half for the larger number.
Another way of saying this is steping through the list from one end to the other to pull out the next smallest in the unsorted list will take the same number of steps as steping in from each end one after another.
|
|
|
|
 |
|
 |
Sure, It is the same number of swaps. In fact, it is the theoretical minimum number of swaps. It is not the number of swaps that makes it faster, it is the number of times I have to go through the list to make them and how many comparisions I have to make each time.
If I go through a list of N items, if it is done as the article writer suggests, I have to go through the list N times, with an average of N/2 items each time, making do total items to check N*N/2.
If I do it my way, I will only have to go through the list N/2 times, and since the list is shortened by two items each run, the average length of the list will be half as long as in the previous example, ie N/4. Total items to check is therefore (N*N)/(2*4) = N*N/8.
If you don't believe me, clock it.
If you wan't a really fast sort, it is fairly easy to make a recursive algorithm. I once made one like that, but it bombed out the call stack completely when sorting long lists...
|
|
|
|
 |