Parallel programming requires an easy way to tell if a user has multiple CPUs
Now that most new computers have multiple processors, we are seeing a rapid rise in the use of parallel programming. Many libraries already supply efficient code for multi-core machines, but how can you tell whether the computer you are running on really has more than one CPU?
Why is the need to identify multiple CPUs becoming a pressing issue now?
Most programs today are multithreaded. More and more libraries provide efficient multithreaded code on the assumption that it will run on computers with multiple cores. Knowing whether the computer is actually multi-core is therefore essential, but in certain cases, web applications for instance, you can’t simply “ask” the Operating System how many cores it has.
How is this solved “normally”?
When running in the full .NET environment, you can easily get the number of processors by reading the ‘Environment.ProcessorCount’ property.
The bad news is that this property isn’t available in situations where you don’t have permission to ask the Operating System how many processors the system has. This is exactly the situation you are in when you are developing web pages and RIAs based on MS Silverlight or Adobe Flash.
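As a point of comparison, the direct approach can be sketched as follows. This is a minimal illustration that assumes a full-trust .NET host; the class and method names (‘ProcessorInfo’, ‘LogicalProcessorCount’, ‘IsMultiCore’) are hypothetical and only wrap the standard ‘Environment.ProcessorCount’ property.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ProcessorInfo
{
    // Returns the number of logical processors reported by the runtime.
    // Assumption: the host allows this call; as described above, restricted
    // hosts such as Silverlight do not expose it to application code.
    public static int LogicalProcessorCount()
    {
        return Environment.ProcessorCount;
    }

    // True when the runtime reports more than one logical processor.
    public static bool IsMultiCore()
    {
        return LogicalProcessorCount() > 1;
    }
}
```

Where this call is permitted, there is no need for the probing technique described in the rest of this article.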
How can multiple CPUs be identified in a Silverlight application, or similar RIA situation?
I wrote the following code to determine whether a machine is multi-core using only basic threading operations that are available even in a restricted environment. The code runs two threads in parallel and checks whether they can really run at the same time.
SpinWait - The magic key
Roughly defined, a thread’s ‘quantum’ is the minimum time that the Operating System will dedicate to the thread without making a context switch. It’s usually somewhere between 10 and 15 milliseconds. The code I wrote relies on ‘SpinWait’, which keeps the processor busy for the given number of iterations, much like an empty loop with an incrementing index. Unlike ‘Sleep’, it does not give up the processor, so the thread is not switched out unless it spins for longer than its quantum.
Here is the code I use:
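bool moreThanOne = false;
bool toContinue = true;
ManualResetEvent m = new ManualResetEvent(false);
Thread t = new Thread(new ThreadStart(delegate
{
    m.Set();                                    // signal that the thread is up and about to loop
    while (toContinue) { moreThanOne = true; }  // keep setting the flag until told to stop
}));
t.IsBackground = false;
t.Start();
m.WaitOne();              // wait for the secondary thread to start
Thread.Sleep(0);          // yield, so that we resume at the start of a fresh quantum
moreThanOne = false;      // clear the flag just before spinning
Thread.SpinWait(100000);  // spin for less than one quantum
toContinue = false;       // stop the secondary thread; moreThanOne now holds the result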
Note: The method described here isn't deterministic. The OS could force a context switch because of a system interrupt or for some other reason, in which case the code might report more than one core even though the machine has only one.
The code starts a thread dedicated to repeatedly setting a boolean flag, with the idea that this thread will run on a secondary core, if one exists. To ensure that the thread has started and is ready to loop, a wait handle is used. The main thread waits for the secondary thread to start, and then calls ‘Thread.Sleep(0)’ to force a context switch, so that the next instruction executes at the beginning of a fresh quantum. It then clears the flag and calls ‘SpinWait’ for 100000 iterations, which gives the other thread a chance to set the flag on another core while we are still spinning. Since the ‘SpinWait’ is shorter than the thread’s quantum, it should not trigger a context switch. Now we can examine the outcome: if the flag was set during the spin, it must have been set by another processor that ran the secondary thread while the main thread was busy spinning.
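The steps above can also be packaged as a reusable method. The following is only a sketch under a few assumptions of mine, not the author's exact code: the names ‘MultiCoreProbe’ and ‘ProbeOnce’ are hypothetical, the flags are static ‘volatile’ fields (so that neither thread keeps a stale copy in a register), and the method is therefore not safe to call from several threads at once.

```csharp
using System.Threading;

// Hypothetical wrapper around the probing technique described above.
public static class MultiCoreProbe
{
    // Volatile so that reads and writes are not cached or hoisted out of loops.
    private static volatile bool flagSet;
    private static volatile bool keepRunning;

    // Returns true if the helper thread set the flag while the calling thread
    // was spinning, which suggests (but does not prove) a second core.
    public static bool ProbeOnce(int spinIterations)
    {
        flagSet = false;
        keepRunning = true;

        using (ManualResetEvent started = new ManualResetEvent(false))
        {
            Thread helper = new Thread(new ThreadStart(delegate
            {
                started.Set();                           // tell the caller we are running
                while (keepRunning) { flagSet = true; }  // keep setting the flag
            }));
            helper.Start();

            started.WaitOne();               // wait until the helper is ready to loop
            Thread.Sleep(0);                 // yield, to resume at the start of a quantum
            flagSet = false;                 // clear the flag just before spinning
            Thread.SpinWait(spinIterations); // spin without giving up the processor
            bool result = flagSet;           // did the helper run while we were spinning?

            keepRunning = false;             // let the helper exit
            helper.Join();
            return result;
        }
    }
}
```

A call such as ‘MultiCoreProbe.ProbeOnce(100000)’ mirrors the iteration count used in the article. The result is still subject to the false positives described in the note above.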
Implementation and usage
I developed the code presented above while researching the performance of our http://www.headup.com application. I discovered that we were wasting a lot of time on mandatory locks that, although required, didn’t justify the overhead they incurred.
Spinlock – good value, low cost
After some research, I decided to use a ‘SpinLock’ implementation, which takes advantage of the computer’s second core to make an efficient lock. It boosted our application’s performance for what turned out to be very little effort.
SpinLock’s secret is that instead of using a system call, the way ‘lock’ does, it uses ‘SpinWait()’ to wait while the other core releases the lock. Of course, if the computer hasn’t got another core, this is useless: the thread holding the lock cannot make progress while the waiting thread is spinning. So, in order to know whether I could use ‘SpinLock’ effectively, I first needed to verify that the machine I was running on had more than one processor. Hence the code above…
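To make the idea concrete, here is a minimal spin lock sketch. It is neither the implementation I used nor the one linked below; the class name ‘SimpleSpinLock’ and the spin count of 20 are assumptions chosen for illustration. It uses only ‘Interlocked.CompareExchange’, ‘Interlocked.Exchange’ and ‘Thread.SpinWait’.

```csharp
using System.Threading;

// Hypothetical minimal spin lock, for illustration only.
public sealed class SimpleSpinLock
{
    private int held; // 0 = free, 1 = taken

    public void Enter()
    {
        // Atomically take the lock if it is free; otherwise spin briefly
        // and retry, on the assumption that another core will release it soon.
        while (Interlocked.CompareExchange(ref held, 1, 0) != 0)
        {
            Thread.SpinWait(20); // arbitrary, illustrative spin count
        }
    }

    public void Exit()
    {
        // Atomically mark the lock as free again.
        Interlocked.Exchange(ref held, 0);
    }
}
```

A caller would wrap a short critical section between ‘Enter()’ and ‘Exit()’, ideally in a try/finally block. On a single-core machine the spinning thread simply burns its quantum, which is why the multi-core check matters before choosing this kind of lock.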
Examples and notes
- There are several "SpinLock" implementations; I found this one to be particularly nice: http://www.bluebytesoftware.com/blog/PermaLink,guid,fd6ed0d7-2849-4ca1-9619-74cd5713c3c0.aspx
- Test my processor detector live for yourself at http://yogil.com/yogi/processors.html. To compare, you can check manually how many processors your computer has by clicking ‘Properties’ on ‘My Computer’ (Windows machines).
Ideas for the future
- Because of the risk of false positives, it would probably be advisable to run the method several times and use the most frequent outcome. This should make the result considerably more reliable.
- By using multiple threads, it should be possible to discover the exact number of processors, and not merely whether the machine has one or more.
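The first idea, repeating the check and taking the most frequent outcome, could look roughly like the sketch below. The names ‘ProbeVoting’ and ‘MajorityVote’ are hypothetical, and the probe is passed in as a delegate so that any single-run detection method can be plugged in. Requiring an odd number of runs is my own choice, made to avoid ties.

```csharp
using System;

// Hypothetical helper that repeats a probe and returns the majority answer.
public static class ProbeVoting
{
    // Runs 'probe' exactly 'runs' times and returns the most frequent result.
    // 'runs' must be a positive odd number so that a tie is impossible.
    public static bool MajorityVote(Func<bool> probe, int runs)
    {
        if (probe == null) throw new ArgumentNullException("probe");
        if (runs <= 0 || runs % 2 == 0)
            throw new ArgumentException("runs must be a positive odd number", "runs");

        int positives = 0;
        for (int i = 0; i < runs; i++)
        {
            if (probe()) positives++;
        }
        // True only when more than half of the runs reported a second core.
        return positives * 2 > runs;
    }
}
```

The number of runs is a trade-off: more runs smooth out occasional false positives but lengthen the check, since each run occupies a core for the duration of the spin.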