I have to process a vector containing around 30 million elements; the result is another vector containing around 55 million elements.

I decided to go with multithreading. I create a few threads, and each thread handles a portion of the vector and stores its sub-result in a temporary vector. When all threads have finished, I combine the temporary vectors into one big vector, which is the final result. No mutex or lock is used at all. I am also careful with the vectors: I use reserve wherever possible to avoid reallocation.
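
For reference, the approach described above could look roughly like the sketch below. The actual per-element work is not shown in the question, so `process_element` is a hypothetical stand-in that emits one or two outputs per input, and `process_parallel` is an illustrative name, not the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical per-element work (an assumption for this sketch):
// every input yields one output, even inputs yield a second one.
inline void process_element(int x, std::vector<int>& out) {
    out.push_back(x);
    if (x % 2 == 0) out.push_back(x / 2);
}

// Splits the input into contiguous chunks, one per thread. Each thread
// fills a private buffer; the buffers are concatenated after all joins.
std::vector<int> process_parallel(const std::vector<int>& input,
                                  unsigned num_threads) {
    assert(num_threads > 0);
    const std::size_t n = input.size();
    const std::size_t chunk = (n + num_threads - 1) / num_threads;

    std::vector<std::vector<int>> partial(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([&input, &partial, t, begin, end] {
            std::vector<int> local;             // touched by this thread only
            local.reserve(2 * (end - begin));   // upper bound for this sketch
            for (std::size_t i = begin; i < end; ++i)
                process_element(input[i], local);
            partial[t] = std::move(local);      // each thread writes its own slot
        });
    }
    for (auto& w : workers) w.join();

    // Serial merge: one thread copies every output element once more.
    std::size_t total = 0;
    for (const auto& p : partial) total += p.size();
    std::vector<int> result;
    result.reserve(total);
    for (const auto& p : partial)
        result.insert(result.end(), p.begin(), p.end());
    return result;
}
```

Note that the final merge runs on a single thread and copies every output element again, so if the per-element work is cheap, that copy and the memory traffic in general can take a large share of the total time regardless of the thread count.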

I have tried running the program with 1, 2, 4, and 8 threads. 1 thread gives the best result. I am so confused. Can anyone tell me why?

Btw, all the operations happen in RAM, with no I/O. I am running the program on my laptop with 4 cores.


What I have tried:

I have tried different numbers of threads. None of them shows an increase in speed; 1 thread looks the fastest.
Updated 26-Aug-21 1:30am

Maybe your threads run on a single core. See, for instance, "multithreading - C++ run threads on different cores" on Stack Overflow.
   
We can't help you much based on so little information. Note, though, that std::vector is not thread safe* and does no internal locking: concurrent reads are fine, and different threads may modify different elements, but anything that changes the size of a shared vector needs external synchronization. So as long as each thread only touches its own temporary vector, the container itself is not what serializes your threads.
Additionally, the number of threads created can slow the machine down as well: if it exceeds the number of free cores, the thread-switching overhead can make a marked difference.

I'd think about what you are processing, make sure each smaller vector is owned by exactly one thread, and time the individual phases (per-thread work, final merge) to see where the time actually goes.
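
A minimal sketch of the two practical points above: capping the thread count at what the hardware reports, and timing a piece of work so that different configurations can be compared. The helper names `pick_thread_count` and `time_ms` are made up for this example; `std::thread::hardware_concurrency` and `std::chrono::steady_clock` are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <thread>

// Caps the requested thread count at the number of hardware threads.
// hardware_concurrency() is only a hint and may return 0 when unknown.
unsigned pick_thread_count(unsigned requested) {
    assert(requested > 0);
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;  // fall back to a single thread
    return std::min(requested, hw);
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the per-thread phase and the merge phase in separate `time_ms` calls, in an optimized build, should show whether the time is going into the computation or into moving memory around.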

* "Using Data Structures Safely with Threads"
* "c++ - Is std::vector or boost::vector thread safe?" on Stack Overflow
   
Every thread carries some overhead and consumes system resources, which also costs performance. Another problem is the so-called "thread explosion", when more threads are created than there are cores on the system. And having threads wait for other threads to finish is the worst idea in multithreading.

The best strategy is to create a few long-running threads, to which complete workloads are enqueued and from which the results are dequeued. I would start with a UI thread and a worker thread for the vector computation.
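
As a rough illustration of the long-running worker idea, here is a small sketch built only from standard library pieces. `WorkerPool` is a hypothetical name for this example, not an existing library class; jobs are plain `std::function<void()>` objects, and results are expected to be written by each job into a slot it owns.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// A fixed set of long-running threads that pull jobs from a shared queue.
class WorkerPool {
public:
    explicit WorkerPool(unsigned n) {
        assert(n > 0);
        for (unsigned i = 0; i < n; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the queue, then joins them.
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

The lock here only protects the queue, so it is held briefly per job. For this to pay off, each job should be a sizeable chunk of the vector rather than a single element.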

Tip: check that you compile for multithreading in the compiler and linker settings.
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



