Click here to Skip to main content
Click here to Skip to main content

Faster Sorting through Appropriate Threading on Multi-processor Machines

By , 5 Mar 2009
 

Introduction

The goal was to achieve improved sorting performance by spawning a limited number of threads to use as much of the available processing power on multi-core and/or multi-processor machines as possible. The result is an improvement of 40% on dual core and more than 100% on a 4 processor machine.

Background

As part of a team involved in importing RDBMSs into an Associative Knowledge base, I recently inherited some code that hashed through a large amount of relational data, aggregating it before pushing it onto the server.

This article though, really has nothing to do with that. I was just looking for an excuse to generate heat. In fact I was quite embarrassed that the code was consuming a full 25% of the available processing on the quad core server, so I decided to do something about it.

After 2 months of cursing, I now have a new found respect for threaded architecture. There are pitfalls galore, even if you read all the documents. Nuances, such as code that runs on one machine and not on another. A lock that just doesn't seem to lock, and locks that make your code run like molasses.

The number one lesson I've learned is not to spawn threads unless there is a clear benefit to doing so. Even cheap threads are expensive when you have thousands of them around.

The number two lesson is to minimize contention or to avoid locking if you can.

Okay, so applying these lessons to sorting gives the following:

  1. Split your big unsorted list by the number of available processors.
  2. Spawn one thread per split that sorts them using your favourite sorting routine.
  3. Merge pairs of result lists using Mergesort until you have no more results to merge.

(You will find this logic applied in the ParallelSorter method of Jigger.Parallel<T>).

Perfectionist Disclaimer

I do NOT claim that the source code provided has much merit as far as programming style goes. The sorting logic was extracted from a working class and placed in a test and benchmarking solution for the sole purpose of proving that it was worth doing. Memory consumption was taken into consideration but not to the detriment of performance.

Suggestions

However, having said that, if you have anything (con/de)structive to say, I'm all ears.

Source and Contributing

The zip contains the source, unit test and benchmarking files. I hope you have Visual Studio 2008, otherwise you will have to do some copy-pasting.

You will also find a set of results from my tests on my Dual Core notebook and on a Dual core, Dual processor server. I would be interested to hear from those of you that have other configurations. To add to the list, just install Ntime.

Using the Code

You will find many examples in the Parallel.Sort.Test.ParallelTester class. I have extracted what's pertinent and put it below:

//Initialise the collection
List<KeyValuePair<int, string>> stringDupesList= 
     new List<KeyValuePair<int, string>>(); 

// Load up a List of KeyValuePairs
ManyStringAdder(ref stringDupesList, 20);

//Initialise the Parallel sorter
Parallel<KeyValuePair<int, string>> parallelDo = 
     new Parallel<KeyValuePair<int, string>>(new KeyValueComparer<int, string>()); 

// And now you can sort using one of the 3 supplied 
// Sort using MergeSort. This one is a stable sort
stringDupesList = parallelDo.MergeSort(stringDupesList));

// This is the sort provided by Microsoft. 
// I've assumed it to be a derivative of QuickSort
stringDupesList = parallelDo.MSQuickSort(stringDupesList));

// This is my attempt at using the Logic from FastQuickSort with MergeSort. Not so good.
stringDupesList = parallelDo.MergeInsertionSort(stringDupesList));        

Points of Interest

Building a solution that's worthy of being published as an article requires quite an effort. I had already benchmarked my sort with a console application, but when you need to show your results to others you have to be more formal and certain. For unit testing, I had already settled on Visual Studio 2008 but for micro benchmarking I had to do some research.

Micro-benchmarking

After a few days of rummaging around the interweb, wondering whether I should translate my own JavaScript profiler and checking to see if profiling tools could benchmark to the degree I require, I ended up with three 3rd party candidates.

  • Nperf is a neglected but cool tool that provides graphs. Maybe part of Gallio ?
  • Ntime provides an nUnit interface, is feature full and was easy to implement
  • Microbenchmarking by Jon Skeet is C# source that you compile in your class

I went with NTime because I had little time to figure out the other candidates. Jon Skeet's tool provides a subset of the functionality in NTime and NPerf was temperamental to say the least. I still think I should translate my JavaScript profiler because it outputs to XML, ready to graph. But that's for another day.

Testing

The magic here is in what to test. I knew that my sort was running in production but I had never tested some obvious combinations of data types, values and thread count because my application never went near them. But, if you want other people to run your software, you are going to have to think about more uses for your code and test it. In the end, I spent half a day on the sort class and more than a week writing the tests and benchmarking in the hopes that I'll be able to re-use some of it in the future.

Thanks

In no particular order to:

  • NTime (by AdamSlosarski) for saving my time
  • Nick Taylor for being a sounding board
  • Kevin Fleming for inspiring usage of sound practices that I sometimes use
  • My wife for going with the flow
  • Ron Everett for not saying no when I said I wanted to write this article
  • Everyone else that I know

History

  • 2009-03-05
    • First release of Jigger.Parallel<T>.Sort

License

This article, along with any associated source code and files, is licensed under The Apache License, Version 2.0

About the Author

Jason Rudland
Architect Relavance
Canada Canada
Member
No Biography provided

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralTried CCR -- Microsoft's Concurrency and Coordination Runtimememberjboarman9 Mar '09 - 11:02 
Have you tried using Microsoft's relatively new library that dramatically simplifies concurrency issues? They have a free library available in the express edition of the robotics toolkit. They claim it will be included natively within the .NET framework in the future, presumably with VS 2010. For now, it's a separate library.
 
Intro to CCR:
http://msdn.microsoft.com/en-us/library/bb648752.aspx[^]
 
There's also a decent CCR video tutorial on Channel 9.
 
CCR included in Robotics Developer Studio 2008 Express Edition:
http://www.microsoft.com/downloads/details.aspx?familyid=84c5b49f-0f9c-4182-a267-a951328d3fbd&displaylang=en[^]
GeneralRe: Tried CCR -- Microsoft's Concurrency and Coordination RuntimememberJason Rudland11 Mar '09 - 16:45 
Not yet. CCR has been pointed out to me before, so the concepts are sitting in a holding pattern in my brain till I forsee they will fill a need. Thanks for the links. I did go and have a look again. It's kind of like Micro SOA.
 
Arrrgh you've got me thinking.
QuestionGuarantee of split to cores?memberzaf5 Mar '09 - 20:38 
It seems from the code that the application just creates the threads and the operating system takes care of assigning them to different cores. Does this mean that that there is no guarantee of this split, for example if the machine is running other processes as well? Would there be a way (without the use of special libraries) of assigning a thread directly to a core?
AnswerRe: Guarantee of split to cores?memberAWdrius5 Mar '09 - 23:07 
You will have to do some interop to achieve such a feature http://msdn.microsoft.com/en-us/library/ms686247(VS.85).aspx[^]
An interesting discussion on this at http://social.msdn.microsoft.com/forums/en-US/csharpgeneral/thread/a55733db-cac0-4ccb-a3cc-a584742b41f9/[^]
 
Or you can put the code, that you require to run on on one physical code without switching to another core or processor, inside ThreadAffinity block - http://msdn.microsoft.com/en-us/library/system.threading.thread.beginthreadaffinity.aspx[^]
GeneralRe: Guarantee of split to cores?memberzaf6 Mar '09 - 1:48 
Thanks very much for the info.
 
I followed the links and tested interop and it is very impressive. In a test app I made for a simple repeated multiplication splitting to two cores almost halves the time (with nothing else running).
GeneralRe: Guarantee of split to cores?memberJason Rudland6 Mar '09 - 3:24 
Hey zaf,
 
I'm glad AWdrius answered your question, but I'd like to contribute a too.
 
I must admit that I had mutual concerns, but I wrote the sort first and ran my tests. And lo and behold, all the tests I have run show extended periods of 100% CPU usage. Does that justify not bothering with thread affinity?
...
OK I just read one of AWDrius's links and I concur with Peter Ritchie. To paraphrase: Somethings are best left to be managed by the little black box.
 
My question is. Why did I have to do this in the first place? Shouldn't the Array.Sort be auto parallel?
 
my 2 cents
Jason
GeneralRe: Guarantee of split to cores?memberzaf6 Mar '09 - 8:53 
"My question is. Why did I have to do this in the first place? Shouldn't the Array.Sort be auto parallel?"
 
Indeed. In fact I would expect parallelism not only to be incorporated in libraries for applications to execute locally, but also to be offered, subject to pricing to a user selected extent (as in number of "cores"), in cloud technologies such as Azure.
 
I could also see a future where applications can utilize a combination of local and cloud cores for processing. This way the cloud would truly be a grid, equivalent to electricity, spending CPU power and having few worries about sensitive information and code residing at a vendor's premises. In such a scenario I you choose to you could have a big trusted vendor such as MS running your applications and you could have a selection of power utilities also available for executing.
GeneralRe: Guarantee of split to cores?memberJasonShort9 Mar '09 - 11:07 
Of course MS says that Dot Net 4 will include much more ability to run on multi core machines...
 
And I have read a few things that suggested you will finally be able to set processor affinity (tell it which core to run on ) in managed code without resorting to dll imports.
 
Threading is difficult, and I don't expect the Dot Net 4 libs to help it that much. Look at how poorly some of the locking mechanisms perform. You can usually gain serious performance only through domain knowledge. Any generic API will have a difficulty with knowing what you intended.
 
I expect that Array.Sort or some other synchronous objects may be able to take advantage of multicore in the next release. But for most operations you will still have to tune and tweak.
 
BTW, good job on the article.
 
Jason S Short, Ph.D.
VistaDB Software, Inc.

GeneralRe: Guarantee of split to cores?memberJason Rudland11 Mar '09 - 16:59 
Thanks Doctor Short.
 
I was thinking of writing up two more articles that culminate in a multi-core optimized multi-dimension indexable dictionary that accomodates duplicate keys with stable sorting. I only mention it because sometimes its just simple kudos that inspire.
GeneralRe: Guarantee of split to cores?memberJasonShort12 Mar '09 - 11:08 
Multi threading is one of those topics that .Net still needs a lot of coverage. Most people take the threading for granted. Just because it is easy to create a thread now does not mean all the issues around having multiple threads have gone away.
 
Always happy to review anything you want to post when I have time. It takes a level of effort to write up something and post it in a clear format. It is a lot harder than most people think. Way too easy to "vote" on an article, not so easy to write them. Kudos to you for taking the time, and for thinking of others. It is these types of posts that help CodeProject (and indeed the entire .net community) in the long wrong. Keep on coding and posting!
 
Jason S Short, Ph.D.
VistaDB Software, Inc.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web02 | 2.6.130516.1 | Last Updated 5 Mar 2009
Article Copyright 2009 by Jason Rudland
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid