Nemanja Trifunovic wrote: if there are multiple people introducing changes it becomes much harder
We, as a team, seem to have a good combination of division of labor and cooperation. We've had very few thread synchronization issues arise from having multiple people involved in a single code base.
|
I too have been doing this for a long time. One reason is that my home machine has had at least 2 cores since I got my first dual-processor Pentium 1 board in 1995. I started my paying job in 1997, and some of my first projects required a multithreaded design. 13 years later I am still doing multithreaded designs, though for the most part not as low-level as when I started. I tend now to use thread pools more often, along with other libraries that do most of the heavy lifting for me.
John
|
John M. Drescher wrote: I tend to now use thread pools more often and other libraries that do most of the heavy lifting for me.
I'm enjoying that feature of .NET. I've got a significantly multi-threaded application in C# that I've been working on for the last 18 months, and just recently I added the first 'real' thread. The rest of it has been done through the BeginInvoke/EndInvoke constructs, which use the thread pool under the covers for you.
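For anyone following along in C++ rather than .NET, here's a rough analogue of that pool-backed pattern (a sketch, not the .NET mechanism itself): std::async hands the work off without creating a thread by hand, and get() plays the role of EndInvoke by blocking for the result. The Analyze function is just a made-up stand-in for real work.

#include <future>
#include <iostream>

int Analyze(int input)    // hypothetical stand-in for the real work
{
    return input * 2;
}

int main()
{
    // "BeginInvoke": start the work without managing a thread yourself.
    std::future<int> pending = std::async(std::launch::async, Analyze, 21);

    // ... the calling thread is free to do other work here ...

    // "EndInvoke": block until the worker finishes and collect the result.
    std::cout << pending.get() << '\n';   // prints 42
    return 0;
}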
|
I have to disagree (does that label me "cargo cult" now?)
It's ok if it's your bread and butter, and you live in an environment where multithreading is everywhere.
However, adding multithreading to a pool of techniques and skills is hard. It has so many side effects and artifacts, e.g. it affects interface design, as this determines whether external locking is necessary, and preventing deadlocks requires global knowledge of the application. That's a lot of burden for libraries with "Unknown" reuse.
|
peterchen wrote: (does that label me "cargo cult" now?)
Certainly not.
peterchen wrote: That's a lot of burden for libraries with "Unknown" reuse.
Agreed. I believe most library authors shouldn't bother with multithreading concerns, unless it's an expected attribute of the environment in which the library is going to be used. It's nice when the author documents any threading concerns for use in that environment.
As a matter of fact, when an author makes a library thread-safe, it can actually become more difficult to use in a multithreaded environment, since the user no longer has control over the thread synchronization mechanisms used.
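To illustrate that last point, a minimal sketch (LegacyParser is a hypothetical non-thread-safe library class): because the library stays single-threaded, the caller picks the synchronization that fits the application, rather than paying for whatever locks the author baked in.

#include <mutex>
#include <string>

class LegacyParser        // hypothetical library type; no locking inside
{
public:
    void Feed(const std::string& chunk) { buffer_ += chunk; }
private:
    std::string buffer_;
};

class SharedParser        // caller-side wrapper: the user chooses the lock
{
public:
    void Feed(const std::string& chunk)
    {
        std::lock_guard<std::mutex> hold(lock_);   // our policy, not the library's
        parser_.Feed(chunk);
    }
private:
    std::mutex   lock_;
    LegacyParser parser_;
};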
Software Zen: delete this;
|
I discovered that MFC doesn't support parallel code. I had some C++ code that was fully decoupled, no sharing of memory, etc., but it was using Microsoft's STL, and the performance was actually WORSE on a 4 core machine (real cores) than on a single core machine. I ended up launching each task as a separate PROCESS and voila, I suddenly achieved 100% CPU utilization. Pathetic, in my opinion, that a "modern" language backed by a "modern" framework doesn't actually work on a multicore machine. I have yet to try something similar with C#; I assume it's not backed by the same memory management schemes.
Marc
|
MFC or STL ???
2 bugs found.
> recompile ...
65534 bugs found.
|
To be honest Marc, I would have expected you to track down the bottleneck and post the reasons here.
|
Andre xxxxxxx wrote: To be honest Marc, I would have expected you to track down the bottleneck and post the reasons here.
I did, as much as I needed to for the time investment: Microsoft's STL and the allocations it was doing through MFC. A simple threaded test app confirmed that was the problem. And actually, I posted about this about 8 months ago, when I was first trying to figure out the problem.
Marc
|
Marc Clifton wrote: but it was using Microsoft's STL
Well... I use vectors in threaded operations all the time. Your lists and trees will bog down in threading, but a vector will only kill you on an expansion. I control the expansion, thus preventing any issues with STL vectors, making them the easiest and fastest to use. I have used some of the others, but you really have to know how STL is storing and accessing the information to know what you have to do to use it in parallel.
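A minimal sketch of what I mean by controlling the expansion (the sizes are made up for illustration): reserve the capacity up front, so no reallocation, and therefore no allocator traffic between threads, happens while the workers run.

#include <thread>
#include <vector>

void Worker(std::vector<double>& results)
{
    results.reserve(100000);          // pay for the storage once, up front
    for (int i = 0; i < 100000; ++i)
        results.push_back(i * 0.5);   // never triggers a reallocation
}

int main()
{
    std::vector<double> a, b;         // one private vector per thread
    std::thread t1(Worker, std::ref(a));
    std::thread t2(Worker, std::ref(b));
    t1.join();
    t2.join();
    return 0;
}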
_________________________
John Andrew Holmes "It is well to remember that the entire universe, with one trifling exception, is composed of others."
Shhhhh.... I am not really here. I am a figment of your imagination.... I am still in my cave so this must be an illusion....
|
El Corazon wrote: but a vector will only kill you on an expansion.
Well, that's exactly what was going on. And unfortunately, as this was an analysis routine that analyzes switch topologies for failure conditions, the modus operandi of the algorithm is expanding various vectors, maps, etc.
Marc
|
Marc Clifton wrote: as this was an analysis routine that analyzes switch topologies for failure conditions, the modus operandi of the algorithm is expanding various vectors, maps, etc.
Not enough memory to reserve beforehand?
_________________________
John Andrew Holmes "It is well to remember that the entire universe, with one trifling exception, is composed of others."
Shhhhh.... I am not really here. I am a figment of your imagination.... I am still in my cave so this must be an illusion....
|
El Corazon wrote: Not enough memory to reserve beforehand?
Ah, the problem is, it's impossible to figure out beforehand, though ballpark estimates would definitely be doable. I'll have to look into that.
Marc
|
Marc Clifton wrote: Ah, the problem is, it's impossible to figure out beforehand, though ballpark estimates would definitely be doable. I'll have to look into that.
If you have the memory, ballpark estimates and even overestimates will help. Most STL implementations add 50% more storage when you run out of reserved storage (though a few double it). So if you can get past most of the reallocs on the small end, you may only realloc a few more times... and I know it is a sin to use more memory than you need... but... if you must get it done fast, overestimate if you have the memory to spare.
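If you want to see the effect, here's a small sketch that counts the reallocations both ways; the exact counts depend on your STL's growth factor (1.5x or 2x).

#include <iostream>
#include <vector>

static std::size_t CountReallocs(std::vector<int>& v, int n)
{
    std::size_t reallocs = 0, last = v.capacity();
    for (int i = 0; i < n; ++i)
    {
        v.push_back(i);
        if (v.capacity() != last) { ++reallocs; last = v.capacity(); }
    }
    return reallocs;
}

int main()
{
    std::vector<int> grown;                 // grows on demand
    std::vector<int> reserved;
    reserved.reserve(1500000);              // deliberate overestimate

    std::cout << "reallocs without reserve:   " << CountReallocs(grown, 1000000) << '\n';
    std::cout << "reallocs with overestimate: " << CountReallocs(reserved, 1000000) << '\n';  // prints 0
    return 0;
}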
_________________________
John Andrew Holmes "It is well to remember that the entire universe, with one trifling exception, is composed of others."
Shhhhh.... I am not really here. I am a figment of your imagination.... I am still in my cave so this must be an illusion....
|
El Corazon wrote: and I know it is a sin to use more memory than you need... but... if you must get it done fast, overestimate if you have the memory to spare.
That doesn't bother me at all. The number of combinations of failure cases that have to be analyzed is in the billions, so there's no way to hold all that in memory anyway, but there's a lot of state information that does fit in memory easily, and of course every iteration of a failure analysis has to clear various working lists. So thank you, I'll have to give this a try!
Marc
|
C++ isn't modern, and neither is MFC.
OTOH, I've used threading without problems in MFC apps - the one thing you should not try on Windows, anyway, is a multithreaded UI.
|
I am using the most primitive and most horrible way of implementing parallelism: manual thread creation/synchronization/communication. I have been doing it for 10 years now, and the more I know, the more I dislike it.
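For anyone who hasn't had the pleasure, a minimal sketch of that style: one hand-created worker, a mutex, and a condition variable forming a tiny producer/consumer hand-off. Every detail is manual, which is exactly why it wears you down at scale.

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

std::mutex              gate;
std::condition_variable ready;
std::queue<int>         work;
bool                    done = false;

void Consumer()
{
    std::unique_lock<std::mutex> hold(gate);
    while (!done || !work.empty())
    {
        // Manual communication: sleep until the producer signals us.
        ready.wait(hold, [] { return done || !work.empty(); });
        while (!work.empty())
        {
            std::cout << "got " << work.front() << '\n';
            work.pop();
        }
    }
}

int main()
{
    std::thread consumer(Consumer);       // manual thread creation
    for (int i = 0; i < 5; ++i)
    {
        { std::lock_guard<std::mutex> hold(gate); work.push(i); }  // manual locking
        ready.notify_one();
    }
    { std::lock_guard<std::mutex> hold(gate); done = true; }
    ready.notify_one();
    consumer.join();
    return 0;
}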
|
I prefer the GPU over the CPU for parallel operations. My favorite is NVIDIA CUDA, and it's much, much faster.
|
Is there a library that wraps the whole thing with a CPU / CPU-SIMD implementation if no CUDA-supporting card is found, or do you have to require a newer NVIDIA card / code it on your own?
|
AFAIK, CUDA is strictly for NVIDIA GPUs only. Earlier versions supported an emulator for non-NVIDIA devices, but that was eliminated completely in v3.0. For use on other devices, there's OpenCL (which has both CPU and GPU SIMD implementations). But I prefer CUDA because it's faster even than OpenCL and very similar to OpenCL in syntax. On top of that, NVIDIA, ATI, and (to some extent) S3G have OpenCL support. Moreover, NVIDIA's customer base is larger than ATI's, though over the last 6 months ATI's customer base has shown excellent growth.
To run CUDA, an NVIDIA 8-series or newer card is required. I work with a GeForce 8500GT (compute capability 1.1), which is a very low-end card, and its performance is a little better than my Intel Core 2 Duo E8400 @ 3.00GHz.
|
peterchen wrote: Is there a library that wraps the whole thing using CPU / CPU SIMD implementation if no CUDA-supporting card is found, or do you have to require a newer NVIDIA card / code it on your own?
Get Thrust and use the OpenMP back-end. Unfortunately that choice is made at compile time; I was talking with one of the developers during an NVIDIA class on Thrust, and he is considering doing a run-time version. Certainly you can set up one call to an OpenMP routine and another to a CUDA one, but that means coding everything twice. When Thrust gets run-time OpenMP/CUDA selection working, it will be much easier.
http://code.google.com/p/thrust/wiki/DeviceBackends
There are also projects that handle multiple backends, like Ocelot.
CUDA, in and of itself, can only be compiled in emulation mode to run on a CPU, but that limits what libraries and calls you can make.
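As a concrete taste, here's a small sketch of the compile-time switch (flags per the DeviceBackends page above): the same Thrust code, built with the OpenMP device system, runs its "device" algorithms on CPU threads instead of a CUDA card.

// Build without a CUDA card, e.g.:
//   g++ -O2 -fopenmp -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP sort.cpp
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <cstdlib>

int main()
{
    thrust::host_vector<int> h(1 << 20);
    for (std::size_t i = 0; i < h.size(); ++i)
        h[i] = std::rand();

    thrust::device_vector<int> d = h;        // "device" = OpenMP threads here
    thrust::sort(d.begin(), d.end());        // parallel sort on the chosen backend
    return 0;
}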
_________________________
John Andrew Holmes "It is well to remember that the entire universe, with one trifling exception, is composed of others."
Shhhhh.... I am not really here. I am a figment of your imagination.... I am still in my cave so this must be an illusion....
|
I don't know if people realise this, but only certain tasks are suitable for parallelism: specifically, tasks that can be split into completely separate units with no data sharing between threads. Once you have data shared by several threads, you have thread synchronisation issues and you are in for some *pain*.
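For what it's worth, a minimal sketch of the "completely separate units" case, using OpenMP (compile with -fopenmp on g++, /openmp on MSVC): each iteration writes only its own slot, so the threads never share data and no synchronisation is needed.

#include <cmath>
#include <vector>

int main()
{
    std::vector<double> out(1000000);

    #pragma omp parallel for              // iterations split across cores
    for (int i = 0; i < 1000000; ++i)
        out[i] = std::sqrt(static_cast<double>(i));  // touches out[i] only

    return 0;
}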
|
I don't know how this is on other platforms, but on Windows the most common reason for parallelism is not performance but keeping the UI responsive (not sure if this counts as "strict" parallelism).
Also, when writing libraries to be consumed by others, you often have to design for parallelism, whether it gets used or not.
And yes, it's a pain.
|
I think this poll is specifically about performance. Of course, there are various other reasons why you might need multi-threading.
|
I write image analysis software and parallel processing is an absolute must. BUT you better be careful - coding by the seat of your pants is asking for trouble. You have to spend some time designing how it's going to work, otherwise it probably won't, or (worse) it will work sometimes, or (even worse) it will work slightly differently each time. Yes, I have been reduced to swearing at my own software. Some careful planning and it shouldn't be too painful, just tedious. But you get to watch every core running flat out which is pretty funny. Hah! Who's making who work now?