Hello Experts

Why are multiple processes more performant than one process with multiple threads for parallelization?

Background:
In my application, I need to calculate a lot of matrix inversions (SVD). This more or less means only that a lot of CPU power is needed.

Observation:
A. 1 process with 8 threads: Task Manager shows about 80% CPU usage. I can still easily control my PC.

B. 8 processes with 1 thread each: CPU usage is 100% and one feels that the PC is working at full load. Controlling the PC becomes sluggish.
Does anybody have a hint for me why it is like this?

[EDIT]
The parallelized tasks can easily be prepared in memory. Disk access is not involved.
[EDIT]

Thank you in advance.
Bruno

At present I observe this with
- Processor “i7-3770”
- All processes/threads with priority “normal”

I see similar behavior with the Borland C++ tool twinecompile: 8 processes, CPU fully consumed :)
Updated 29-Nov-14 5:57am
Comments
KarstenK 28-Nov-14 13:24pm    
What does it look like in the Task Manager? How are the individual cores loaded?
Is there some sync between the threads via global memory or handles?
[no name] 28-Nov-14 14:35pm    
Thank you for your interest.

Task Manager: the overall CPU usage shows this 80%. For me, 1 process with 8 threads feels like "soft parallelization".

8 processes with one thread each really do consume "all of the CPU"; it feels like "hard parallelization".

Sorry for this non-technical description.

Sync: No, these are really independent tasks! No global memory which needs to be shared. Simply parallel permutation tasks.
BillWoodruff 28-Nov-14 14:09pm    
+5 I think this is a very interesting question to ask !
[no name] 28-Nov-14 14:42pm    
Thank you very much for your interest and of course for your 5.
I'm "lost" in this observation. I'm really asking myself why it is like that. OK, I can imagine that processes have somewhat more priority than threads. But overall, my observations are not clear to me.
Especially if this should be the "truth", it becomes much harder for me, because then I need to use IPC instead of simple thread synchronization :(
Thank you
Bruno
Philippe Mori 28-Nov-14 20:37pm    
One reason might be that using different processes uses more memory. Does the single-process application share data between threads? If so, and you are using a lot of memory, then that might be the reason.
I think that the answer really depends on what you do and how you do it.

There are a few reasons you may think of...
You have 4 cores, so 8 processes require process switching, whereas 8 threads fit into your 8 hardware threads.
In any case, thread switching costs less than process switching...
Also, threads of the same process share the same memory, so they can communicate with each other very fast...
What you can bear in mind when deciding what to use is that threads are very good for small but frequent jobs, whereas processes are better for long-running jobs...
I would not care that my PC seems to be 'locked' while the application runs if there is no need for I/O; I would even welcome this if it means a better run time...
What you have to look for - IMHO - is how fast the computation is done overall and not how the PC behaves while running. Why should you 'control' your PC in the process?
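To compare run time rather than how the PC feels, a small timing harness may help. The following is only a minimal sketch under stated assumptions: `burn_cpu`, `RunResult` and `run_with_threads` are hypothetical names, and `burn_cpu` is a stand-in for one independent task, not the SVD workload from the question.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one independent CPU-bound task
// (the real workload in the question is an SVD-based computation).
double burn_cpu(std::size_t iterations) {
    double acc = 0.0;
    for (std::size_t i = 1; i <= iterations; ++i) {
        acc += std::sqrt(static_cast<double>(i));
    }
    return acc;
}

struct RunResult {
    double seconds;   // wall-clock time for all tasks
    double checksum;  // sum of all task results, so the work cannot be optimized away
};

// Runs `num_tasks` independent tasks spread over `num_threads` threads
// and reports the elapsed wall-clock time.
RunResult run_with_threads(unsigned num_threads, std::size_t num_tasks,
                           std::size_t iterations_per_task) {
    assert(num_threads > 0);
    std::vector<double> partial(num_threads, 0.0);  // one result slot per thread
    std::vector<std::thread> workers;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&partial, t, num_threads, num_tasks, iterations_per_task] {
            double local = 0.0;
            // Static round-robin split: thread t handles tasks t, t+N, t+2N, ...
            for (std::size_t task = t; task < num_tasks; task += num_threads) {
                local += burn_cpu(iterations_per_task);
            }
            partial[t] = local;  // single write at the end, no locking needed
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    const auto stop = std::chrono::steady_clock::now();
    double checksum = 0.0;
    for (double p : partial) {
        checksum += p;
    }
    return {std::chrono::duration<double>(stop - start).count(), checksum};
}
```

Calling `run_with_threads(1, 64, 5000000)` and `run_with_threads(8, 64, 5000000)` and comparing the `seconds` fields gives a rough speedup figure for the threaded variant; the task count and iteration count here are arbitrary. The multi-process variant would have to be timed separately, for example by starting several copies of the program and measuring until the last one exits.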
 
Comments
[no name] 30-Nov-14 7:50am    
Thank you very much for your Feedback.

My threads are completely independent, therefore inter-thread communication/synchronization is not the bottleneck.
"PC seems to be locked": That is exactly what I am trying to achieve, because then I'm sure that the full CPU power is available for my calculations.
Kornfeld Eliyahu Peter 30-Nov-14 8:31am    
I'm not sure that 100% CPU usage brings you a better run time than 80%. The 20% may be wasted on side processes...
There is no 'magical' 'one-size-fits-all' answer, so you should do some benchmarks on your variants.
[no name] 30-Nov-14 9:00am    
I did these benchmarks. I took one process with multiple threads as 100%.
Compared to this, multiple processes with one thread each show about 140% (+/- 10%). That is exactly the thing I do not understand. Maybe I need to try to assign specific CPUs to the threads.
Kornfeld Eliyahu Peter 30-Nov-14 9:06am    
I used the wrong word - benchmark... I meant to check which finishes first...
[no name] 30-Nov-14 9:10am    
Multiple processes definitely finished first. But as usual, I am most probably doing something wrong... but I don't yet see what :)
In newer versions of Windows, on multicore systems, the system will by default prevent any single process from using all cores at the same time as a precaution against single (faulty) tasks freezing the entire system. A second process however is free to use any remaining processing power, so that is why running multiple processes will run at 100% CPU whereas a single process will not.

On a side note, I work a lot with matrices, but never really have a need for inversion. Any time I have a system of linear equations, I use a linear equation solver, which is much faster (and more numerically stable!) than matrix inversion. I really can't think of any application that has a real need to invert lots of matrices to the point where this becomes a performance issue. Have you considered whether you can avoid some or all of these inversions by just using linear solvers? For a dense n x n system, both a direct solve and the typical inversion algorithm are O(n^3), but the solve needs only about a third of the arithmetic, and once a factorization exists each additional right-hand side costs only O(n^2). (I say typical because more complex inversion algorithms can be better in O complexity even though they may be hard to implement and come with a big constant factor; see http://mathforum.org/library/drmath/view/51908.html for some numbers and background info)
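As an illustration of solving a system directly instead of forming the inverse, here is a minimal sketch of Gaussian elimination with partial pivoting. The names `Matrix` and `solve_linear` and the singularity threshold `1e-12` are assumptions made for this example; production code would normally call an established library such as LAPACK rather than a hand-written routine.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solves A x = b by Gaussian elimination with partial pivoting.
// A and b are taken by value because they are overwritten during elimination.
std::vector<double> solve_linear(Matrix a, std::vector<double> b) {
    const std::size_t n = a.size();
    assert(b.size() == n);
    for (const auto& row : a) {
        assert(row.size() == n);  // A must be square
    }

    for (std::size_t col = 0; col < n; ++col) {
        // Partial pivoting: pick the row with the largest entry in this column.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r) {
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) {
                pivot = r;
            }
        }
        // Assumed absolute threshold; a real solver would scale this to the data.
        if (std::fabs(a[pivot][col]) < 1e-12) {
            throw std::runtime_error("matrix is singular or nearly singular");
        }
        std::swap(a[col], a[pivot]);
        std::swap(b[col], b[pivot]);

        // Eliminate the entries below the pivot.
        for (std::size_t r = col + 1; r < n; ++r) {
            const double factor = a[r][col] / a[col][col];
            for (std::size_t c = col; c < n; ++c) {
                a[r][c] -= factor * a[col][c];
            }
            b[r] -= factor * b[col];
        }
    }

    // Back substitution on the resulting upper-triangular system.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {
        double sum = b[i];
        for (std::size_t c = i + 1; c < n; ++c) {
            sum -= a[i][c] * x[c];
        }
        x[i] = sum / a[i][i];
    }
    return x;
}
```

For example, `solve_linear({{2, 1}, {1, 3}}, {3, 5})` solves 2x + y = 3 and x + 3y = 5 without ever computing the inverse of the coefficient matrix.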
 
Comments
Jochen Arndt 1-Dec-14 3:43am    
+5.
Conclusion: It is not recommended to start multiple instances of one application because the system may freeze.
[no name] 1-Dec-14 14:05pm    
I think the conclusion would rather be: if you want all the CPU power, use processes. Meanwhile, though, I found that the answer is not 100% right. There are some stress tools which consume the whole CPU with only one process.
Stefan_Lang 2-Dec-14 2:07am    
What I described is the default behaviour. I did suspect it's possible to circumvent that, but wasn't sure about that, or how to achieve it.
Jochen Arndt 2-Dec-14 3:17am    
As Stefan said, it is the default behaviour. One reason for the lower load may be core parking which is a power saving option. Searching for "Windows core parking" will present you some tools and registry settings.
nv3 1-Dec-14 4:18am    
5! Very good explanation. As you said, this behavior is by design and not accidental.

Bruno: You might want to experiment with the priority of your process. If you mark it as "high priority", the system might be willing to give it more CPU resources than a usual background process.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


