Quick question for people used to Parallel. I have the following piece of code, nothing fancy:

C#
var result = Parallel.For(0, 10, (i) =>
    {
        Console.WriteLine(DateTime.Now.ToLongTimeString() + "Start sleep: " + i);
        Thread.Sleep(i * 1000);
        Console.WriteLine(DateTime.Now.ToLongTimeString() + "Done sleep: " + i);
    });

Console.WriteLine("DONE!");


I sort of expected it to take around 9 seconds, maybe a little more depending on the scheduler. But it took anywhere from 12 to 25+ seconds. Can someone explain why?

Here's a sample output taking 20s and my comments. Why does the scheduler wait for such a long time before launching the last one? There was a lot of time to launch it. Also, sometimes it is done nicely and quickly (earlier)... why the huge variance? Not much changing on my side from run to run.
5:10:24 PMStart sleep: 5 --> 5
5:10:24 PMStart sleep: 0 --> 5 0
5:10:24 PMDone sleep: 0 --> 5
5:10:24 PMStart sleep: 1 --> 5 1
5:10:24 PMStart sleep: 3 --> 5 1 3
5:10:25 PMStart sleep: 6 --> 5 1 3 6
5:10:25 PMDone sleep: 1 --> 5 3 6
5:10:25 PMStart sleep: 2 --> 5 3 6 2
5:10:26 PMStart sleep: 4 --> 5 3 6 2 4
5:10:27 PMStart sleep: 7 --> 5 3 6 2 4 7
5:10:27 PMDone sleep: 2 --> 5 3 6 4 7
5:10:27 PMStart sleep: 8 --> 5 3 6 4 7 8
5:10:27 PMDone sleep: 3 --> 5 6 4 7 8
5:10:29 PMDone sleep: 5 --> 6 4 7 8
5:10:30 PMDone sleep: 4 --> 6 7 8
5:10:31 PMDone sleep: 6 --> 7 8
5:10:34 PMDone sleep: 7 --> 8
5:10:35 PMDone sleep: 8 --> EMPTY
5:10:35 PMStart sleep: 9 --> 9
5:10:44 PMDone sleep: 9 --> EMPTY
DONE!
Comments
Sergey Alexandrovich Kryukov 24-Oct-11 17:41pm    
Do you use CPU with multiple cores?
--SA

I think the majority of the discrepancy comes from the fact that you're queueing work items to be done on the thread pool vs. starting dedicated threads. In the one method you create dedicated threads for each item that all run immediately, but the ThreadPool doesn't work that way.

When you queue items to the pool, it initially starts out with only so many threads and adds at most 2 a second (last I checked) if there are not enough threads to execute all of the work items in its queue. (The number of initial threads in the pool varies depending on the system specs.)

EDIT
shoulda' fact checked before posting instead of after :)

When queueing items, the thread pool creates threads until it hits the "optimal number of threads", which varies based on the system specs. According to my reference, as of .NET 3.5 it's the number of processors in the system. After it hits that limit, it throttles new thread creation to at most one every half second. The exact algorithm for adding threads isn't something users of the ThreadPool should need to worry about; it's designed to be good enough in most cases.
/EDIT

Once those threads are created, the pool keeps them around for a bit in case it needs them again. When you call ExecuteTests in the loop the first iteration "primes" the ThreadPool for you; it creates all the necessary threads so that enough already exist to handle all the items being queued in subsequent calls. The first iteration takes more time because it starts out with too few threads to handle all the items and takes time to spin more up.
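
The "priming" described above can also be requested up front. The following is a minimal sketch, not something taken from this thread: it assumes that raising the pool's minimum worker count with ThreadPool.SetMinThreads lets the pool create that many threads on demand without the throttled ramp-up. The class and method names (PoolWarmup, EnsureMinWorkers, TimeParallelSleeps) are hypothetical, and the sleeps are scaled down to 100 ms units so the run stays short.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class PoolWarmup
{
    // Raises the pool's minimum worker-thread count to at least `workers`.
    // Returns false if the pool rejects the request.
    public static bool EnsureMinWorkers(int workers)
    {
        int minWorkers, minIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        if (minWorkers >= workers)
            return true;
        return ThreadPool.SetMinThreads(workers, minIo);
    }

    // Runs `count` sleeping iterations (i * unitMs each) and returns elapsed seconds.
    public static double TimeParallelSleeps(int count, int unitMs)
    {
        var watch = Stopwatch.StartNew();
        Parallel.For(0, count, i => Thread.Sleep(i * unitMs));
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        int minWorkers, minIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        Console.WriteLine("Default minimum worker threads: {0}", minWorkers);

        // Assumption: 10 workers is enough for the 10 blocking iterations.
        bool raised = EnsureMinWorkers(10);
        Console.WriteLine("Minimum raised to 10: {0}", raised);

        Console.WriteLine("Parallel.For took {0:0.00}s", TimeParallelSleeps(10, 100));
    }
}
```

Parallel.For may still hand several indices to the same worker, so this does not guarantee a run time equal to the longest sleep; it only removes the wait for new pool threads as one source of delay.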
 
No, the expected non-parallel time is about 45 seconds, not counting overhead! Why do you multiply the sleep time by the index?

I got the same 45 sec with sequential code, and some 18 to 27 sec with parallel code on a 2-core system. That is no surprise.

Also, I used a better timing tool, System.Diagnostics.Stopwatch:

C#
System.Diagnostics.Stopwatch watch1 = new System.Diagnostics.Stopwatch();
watch1.Start();
var result = Parallel.For(0, 10, (index) => {
        Console.WriteLine(DateTime.Now.ToLongTimeString() + "Start sleep: " + index);
        Thread.Sleep(index * 1000);
        Console.WriteLine(DateTime.Now.ToLongTimeString() + "Done sleep: " + index);
});
watch1.Stop();
Console.WriteLine("Total time: {0} seconds", watch1.Elapsed.TotalSeconds);

System.Diagnostics.Stopwatch watch2 = new System.Diagnostics.Stopwatch();
watch2.Start();
for (int index = 0; index < 10; index++) {
    Console.WriteLine(DateTime.Now.ToLongTimeString() + "Start sleep: " + index);
    Thread.Sleep(index * 1000);
    Console.WriteLine(DateTime.Now.ToLongTimeString() + "Done sleep: " + index);
}
watch2.Stop();
Console.WriteLine("Total time: {0} seconds", watch2.Elapsed.TotalSeconds);


If I remove multiplication by index, I get 9-10 sec on sequential code, and about 3 sec on parallel code.

Do you understand that sleeping is pointless in such parallel code? I hope you have done it only as an experiment. Also, you should understand that parallel execution does not always improve throughput. You need a multi-core CPU or multiple CPUs, and a task for which parallel execution makes sense. Also, other processes can reduce the effect. With heavy CPU load you may not get a performance improvement.
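
For comparison, here is a sketch of waiting without blocking any pool thread. It assumes .NET 4.5 or later (Task.Delay and async/await were not yet available when this thread was written), and the names DelayDemo and RunAsync are hypothetical. Each wait is a timer rather than a sleeping thread, so the total should be close to the longest delay rather than dependent on how many threads the pool has.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DelayDemo
{
    // Starts `count` timer-based waits of i * unitMs each and returns
    // the elapsed seconds once all of them have completed.
    public static async Task<double> RunAsync(int count, int unitMs)
    {
        var watch = Stopwatch.StartNew();
        Task[] waits = Enumerable.Range(0, count)
                                 .Select(i => Task.Delay(i * unitMs))
                                 .ToArray();
        await Task.WhenAll(waits);
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        // 100 ms units keep the demo short; use 1000 to mirror the question.
        double seconds = RunAsync(10, 100).GetAwaiter().GetResult();
        Console.WriteLine("Total time: {0:0.00} seconds", seconds);
    }
}
```

This only applies when the "work" really is waiting; CPU-bound work still needs threads and cores.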

—SA
 
Comments
Sergey Alexandrovich Kryukov 24-Oct-11 18:50pm    
OP commented:

Yeah, this is just for testing... completely pointless otherwise... same thing for multiplying the sleeps up top. No reason other than understanding things better.

I know for non parallel it's 45s, that's pretty obvious.

Here I expected it to be 9s+ because I figured all tasks (threads) start running fairly close to each other and expected time was longest + overhead/scheduling.

Really the question is: why doesn't the sleep(9s) start until so late? It could've started a lot earlier. Sometimes it does, sometimes it doesn't.

Other processes will affect the result, but although my machine had several processes up, they were all pretty idle. At the very least I would've expected consistency run to run, 18-27 is far from it.

Yeah, I have a dual core. But even on a single core, I would've expected a runtime of 9s+ since most of these tasks are really just in a waiting state, no work is being done. Does Thread.Sleep() affect anything outside the task's thread??
Sergey Alexandrovich Kryukov 24-Oct-11 18:56pm    
No, in principle you cannot gain performance on a single core. How? Each sleep has to be executed anyway. You see, your sample is a pathological case because of this sleep. Sleep is designed to put a thread in a wait state. The Parallel class is built on top of threads. With Parallel, the parallel tasks are distributed between "real" threads, and each thread is directed to go to sleep, no matter what...
In a setting where the CPU-consuming code is so small (only your WriteLine calls are supposed to take some time), nearly all execution is reduced to the overhead of threads and tasks. In such a setting, it's hard to evaluate timing correctly; you can do it only when the overhead is insignificant compared to the actual work, which is not the case here.

Anyway, if you agree that this makes sense, please consider formally accepting my answer (green button) -- thanks.

--SA
rld1971 25-Oct-11 14:59pm    
Hmm... I think this has to do more with scheduling than whether work is done or not. (?) See below.
I changed my code a little to test several methods. My results tell a different story now... Tasks (or Parallel.For) eventually, starting on the second run, give results closer to what I expected (9+ seconds of run time). To me this is a scheduling question, maybe to do with initialization. Does anybody know?

Code
C#
private void RunTest()
{
    for (int i = 0; i < 5; i++)
        ExecuteTests();
}

private void ExecuteTests()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    ExecParallel();
    Log("Parallel done", sw);
    ExecTasks();
    Log("Tasks done", sw);
    ExecThreads();
    Log("Threads done", sw);
    ExecThreadPool();
    Log("ThreadPool done", sw);
}

private static void ExecParallel()
{
    var result = Parallel.For(0, 10, (i) => { DoWork(i); });
}

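private static void ExecTasks()
{
    var tasks = new Task[10];
    for (int i = 0 ; i < 10 ; i++)
    {
        // Copy the loop variable: the lambda captures i itself, not its current value,
        // so a task that starts late could otherwise see a later value (even 10).
        int j = i;
        tasks[i] = Task.Factory.StartNew(() => DoWork(j));
    }

    Task.WaitAll(tasks);
}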

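private static void ExecThreads()
{
    var threads = new Thread[10];
    for (int i = 0; i < 10; i++)
    {
        // Copy the loop variable so each thread sleeps for its own index,
        // not for whatever value i holds when the thread actually starts.
        int j = i;
        threads[i] = new Thread(() => DoWork(j));
        threads[i].Start();
    }

    foreach (var t in threads)
        t.Join();
}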

private static void ExecThreadPool()
{
    var doneEvents = new ManualResetEvent[10];

    for (int i = 0; i < 10; i++)
    {
        int j = i;
        doneEvents[j] = new ManualResetEvent(false);
        ThreadPool.QueueUserWorkItem((object ctx) =>
                        {
                            DoWork(j);
                            doneEvents[j].Set();
                        });
    }

    foreach (var e in doneEvents) e.WaitOne();
    // WaitHandle.WaitAll(doneEvents); // in STA thread...
}


private static void DoWork(int i)
{
    Thread.Sleep(i * 1000);
}

private void Log(string s, Stopwatch sw)
{
    Console.Out.WriteLine("[Elapsed time: {0:##.00}s] {1}", sw.ElapsedMilliseconds / 1000f, s);
    sw.Restart();
}


Results
[Elapsed time: 27.01s] Parallel done
[Elapsed time: 15.55s] Tasks done
[Elapsed time: 10.06s] Threads done
[Elapsed time: 9.00s] ThreadPool done

[Elapsed time: 9.00s] Parallel done
[Elapsed time: 10.00s] Tasks done
[Elapsed time: 10.05s] Threads done
[Elapsed time: 9.00s] ThreadPool done

[Elapsed time: 9.00s] Parallel done
[Elapsed time: 10.00s] Tasks done
[Elapsed time: 9.05s] Threads done
[Elapsed time: 9.00s] ThreadPool done

[Elapsed time: 9.00s] Parallel done
[Elapsed time: 10.00s] Tasks done
[Elapsed time: 10.06s] Threads done
[Elapsed time: 9.00s] ThreadPool done

[Elapsed time: 9.00s] Parallel done
[Elapsed time: 10.00s] Tasks done
[Elapsed time: 9.04s] Threads done
[Elapsed time: 9.00s] ThreadPool done
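
A note on the Tasks and Threads timings above, which sit near 10 s where about 9 s was expected. One possible contributor (an assumption; nobody in this thread confirms it) is C# closure capture: a lambda created inside a for loop captures the loop variable itself, so work that starts after the loop has advanced can see a later value, up to 10. ExecThreadPool avoids this with its per-iteration copy j. A minimal sketch of the difference, using the hypothetical names CaptureDemo, SharedVariable and PerIterationCopy:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // Every lambda captures the same variable i; they are invoked after
    // the loop has finished, so each one sees the final value of i.
    public static int[] SharedVariable(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
            funcs.Add(() => i);
        return funcs.Select(f => f()).ToArray();
    }

    // Each lambda captures its own copy j, fixed at creation time.
    public static int[] PerIterationCopy(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int j = i;
            funcs.Add(() => j);
        }
        return funcs.Select(f => f()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable(3)));   // prints 3,3,3
        Console.WriteLine(string.Join(",", PerIterationCopy(3))); // prints 0,1,2
    }
}
```

With real threads or tasks the value seen depends on timing, so the effect is less regular than in this sketch, where the lambdas run only after the loop ends.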
 
So this makes sense... maybe we don't know the exact parameters, but I sort of see the logic working now. I moved ExecThreadPool() to run first, and then the rest took as much time as expected. Always learning something new! Is there a better way to do this?
 
Comments
Jimmanuel 26-Oct-11 15:41pm    
>> Is there a better way to do this?

That depends on exactly what you want to do. The Thread Pool is used for work items that need to be processed but it doesn't matter exactly when as long as it's in the near future. Items should generally be short and not block so as to not waste the existence of the thread. Tasks and Parallel use the Thread Pool under the hood but give you a nicer wrapper to handle exceptions and waiting for things to complete.

Dedicated threads are more useful if you need to block a lot, need to execute something right now, or need more fine-grained control of the threads in your app.

I believe you said this was a contrived example (at least I hope it is) so what you need to use depends on what you're trying to accomplish.
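
If you want the Task wrapper but the work blocks for a long time, one middle ground is TaskCreationOptions.LongRunning. The sketch below is an illustration under assumptions, not a recommendation from the thread: LongRunning is only a hint, and it assumes the default scheduler responds by running the task outside the normal pool workers. LongRunningDemo and TimeBlockingTasks are hypothetical names, and the sleeps use 100 ms units to keep the run short.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts `count` blocking tasks (sleeping i * unitMs each) with the
    // LongRunning hint and returns the elapsed seconds until all finish.
    public static double TimeBlockingTasks(int count, int unitMs)
    {
        var watch = Stopwatch.StartNew();
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            int delay = i * unitMs; // per-iteration copy for the lambda
            tasks[i] = Task.Factory.StartNew(
                () => Thread.Sleep(delay),
                TaskCreationOptions.LongRunning);
        }
        Task.WaitAll(tasks);
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        Console.WriteLine("LongRunning tasks took {0:0.00}s", TimeBlockingTasks(10, 100));
    }
}
```

You still get Task.WaitAll and exception propagation, at the cost of one thread per blocking task.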

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


