Hi All,

I have billions of records and I have to do a long calculation on each of them. With a sequential for loop the task takes around 7 hours, which is not acceptable. To overcome this I'm using Parallel.For, but the catch is that concurrency problems occur between the jobs.

Example -

1. Run Parallel.For on 10 records; it creates 10 threads, one for each record.

2. During execution, the first thread finishes all its calculations and clears the data.

3. The second or remaining threads then get an error in the calculation saying that the data is not available.


How can we resolve this concurrency problem?

The ultimate goal of my code is to create a new instance for each record and run it separately in this block, so that it does not affect the others.
Updated 14-May-15 2:05am
Comments
Mehdi Gholam 14-May-15 8:05am    
Your bottleneck is probably the record reading. In any case, find out where it is first, before implementing any technology.
anurag.netdeveloper 14-May-15 8:11am    
Hi Mehdi,

Reading the records is not the issue. The issue is in the threading: there is a single process with multiple threads, and once a thread is done with its calculation job I clear the data for the next record.
Mehdi Gholam 14-May-15 8:15am    
Are your records isolated or dependent on each other?
anurag.netdeveloper 14-May-15 8:19am    
No. Each record has a separate calculation.
Frankie-C 14-May-15 9:04am    
What do you mean by "When first Thread is done with all the calculation clears the data."?
Which data are you clearing? Is there any static or common data that you clear? Do you allocate per-thread local data using thread-local storage (TLS)?

1 solution

You need to isolate thread-local data for each concurrent thread.
First of all, read about TLS (Thread Local Storage)[^], then figure out how to use it in your project.
For samples read this[^], or google for "C# thread local storage example".

The Parallel.For method also exists in a flavour with thread-local storage. See[^].
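
To make the thread-local flavour concrete, here is a minimal sketch, not the asker's actual code. Calculate() and SumOfCalculations() are hypothetical names standing in for the real per-record work. Each worker thread gets its own subtotal, so no thread can clear or overwrite another thread's working data; the shared total is touched only once per thread, through Interlocked.Add.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalSum
{
    // Hypothetical stand-in for the long calculation on one record.
    static long Calculate(int record)
    {
        return (long)record * record;
    }

    public static long SumOfCalculations(int[] records)
    {
        long total = 0;

        Parallel.For<long>(
            0, records.Length,
            // localInit: runs once per worker thread and creates that thread's private subtotal.
            () => 0L,
            // body: reads and returns only the thread-local subtotal, never shared state.
            (i, loopState, subtotal) => subtotal + Calculate(records[i]),
            // localFinally: runs once per worker thread and merges its subtotal atomically.
            subtotal => Interlocked.Add(ref total, subtotal));

        return total;
    }
}
```

Note that the thread-local value is per worker thread, not per record: a thread usually processes several records with the same local value, so anything that must be fresh for every record still has to be created inside the loop body.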
 
Comments
anurag.netdeveloper 14-May-15 10:59am    
Thanks Frankie.
But a quick question here: will this TLS use all the cores of my machine?

Because Parallel.For has the inherent ability to create threads on idle cores, which makes the execution faster. Please correct me if I am wrong.
Frankie-C 14-May-15 11:21am    
Anurag, have you read the links? TLS is used to create variables local to each thread, not to create threads.
You have to use one of the Parallel.For<TLocal> methods. See https://msdn.microsoft.com/en-us/library/dd783299(v=vs.110).aspx
anurag.netdeveloper 15-May-15 4:45am    
I did, Frankie. But something else is happening: when I use this thread-local logic I can see the threads running on separate cores, which is fine, but there is a catch.
Step 1. Run Parallel.For (with thread-local state) on 10 records; it creates 10 threads, one for each record.
Step 2. The first thread enters the loop and takes the input values from the DB for some calculation in Method1().
Step 3. Another thread comes into the loop and replaces the previous thread's input values in Method1() with new ones, and this continues until the end of the loop.
Step 4. Method2() gets the latest input values from Method1() and does the calculation for the rest of the records, which gives a wrong result.



I have to make a separate instance of those methods (Method1(), Method2() ... MethodN()) for each record using the TPL.
Please suggest a solution for this kind of operation. Thanks in advance.
Frankie-C 15-May-15 5:49am    
Dear Anurag, implementing parallel execution, or better, concurrent execution, is unfortunately not that simple. I had the wrong impression that you were a little more experienced, and from your answer I deduced that the problem was simply a matter of variables that should be handled locally (as with thread local storage), but now I think it will not be easy to give you a concise and specific answer.
I can only give you some guidelines for concurrent tasking (multithreaded access to objects).
First of all, consider that all your threads can access variables, values, etc. at any time and in any order. This means that you cannot assume that the flow of execution of your functions is sequential and finite. You have to expect that a variable or any other object can change unexpectedly.
What you have to do: analyze your software flow to avoid uncontrolled access to shared objects (fence the access using sync, a mutex or the like), and check that local or temporary variables and objects are really local and cannot be altered by other threads.
But the most important point is: can your software really perform its operations in parallel? If a computation depends on the previous one it must be executed serially (i.e. if you compute a new value and use it for the next computation, you have a naturally serial execution flow and simply cannot make it parallel!).
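
As an illustration of the "really local" guideline above, here is a rough sketch of the per-record instance the asker describes. It assumes the records are independent, as stated earlier in the thread. RecordCalculator, its fields and the arithmetic inside Method1()/Method2() are hypothetical placeholders, not the asker's real code. The point is only that the working data lives in instance fields of an object created inside the loop body, so no other iteration can replace it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical calculator: all working data is held in instance fields, none in static fields.
public sealed class RecordCalculator
{
    private double _input; // owned by this instance only

    // Placeholder for "take the input values for this record".
    public void Method1(double recordValue)
    {
        _input = recordValue * 2.0;
    }

    // Placeholder for "calculate from the values Method1() stored".
    public double Method2()
    {
        return _input + 1.0;
    }
}

public static class PerRecordDemo
{
    public static double[] Run(double[] records)
    {
        var results = new double[records.Length];

        Parallel.For(0, records.Length, i =>
        {
            // A fresh instance per record: Method2() can only see what this
            // iteration's Method1() stored, never another thread's inputs.
            var calc = new RecordCalculator();
            calc.Method1(records[i]);

            // Each iteration writes only its own slot, so no lock is needed here.
            results[i] = calc.Method2();
        });

        return results;
    }
}
```

If the methods must also touch something genuinely shared (a cache, a counter, a single connection), that access still has to be fenced with a lock or similar, as described in the guidelines.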

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)