I am processing a very large set of files. To do this, I split the work into roughly 500k tasks and wrote a small piece of code to start these tasks in batches and collect their results.

C#
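var threads = 16;

for (int i = 0; i < task_amt; i++)
{
	// Created "cold": the task does not run until Start() is called.
	taskList.Add(new Task<double>(() =>
	{
		// heavy computation code
		return result;
	}));
}

// Run the tasks in batches of `threads`. The bound is taskList.Count (not
// taskList.Count - threads) so the final batch, which may be smaller, also runs.
for (int j = 0; j < taskList.Count; j += threads)
{
	int batchEnd = Math.Min(j + threads, taskList.Count);

	for (int k = j; k < batchEnd; k++)
	{
		Console.WriteLine($"Starting task {k}.");
		taskList[k].Start();
	}
	for (int k = j; k < batchEnd; k++)
	{
		Console.WriteLine($"Waiting for task {k}.");
		taskList[k].Wait();
		var taskResult = taskList[k].Result;
	}
}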


This works like a dream at first: it uses all the logical cores at 100%, and because the work is CPU-heavy, the load on other resources is negligible.

But after about 15 minutes it seems to slow down. After 30 minutes CPU utilization is only 50%, and after 1.5 hours it is 30% (all other resources still negligible). This severely increases the total calculation time. Can anyone tell me why this happens?

UPDATE:

After 14 hours there is barely any CPU usage visible. At full CPU usage the process should have taken about 1.5 hours.

What I have tried:

The only reason I can think of is thermal throttling. I tried to find out whether my CPU is throttling to cool down, but unfortunately the hardware monitoring applications I used did not show the CPU temperature.
Updated 6-Mar-23 21:10pm

Here is how I would approach it...
C#
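using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

int taskCount = 1000;
int completedCount = 0;

Random rand = new Random();

Task[] tasks = new Task[taskCount];

// Calling DoLongTask starts each task immediately ("hot").
for (int i = 0; i < taskCount; i++)
{
    tasks[i] = DoLongTask(5000 + rand.Next(1000, 3000));
}

// Tasks that have not finished yet.
List<Task> pending = new List<Task>(tasks);

while (pending.Count > 0)
{
    Task completed = await Task.WhenAny(pending);

    // Remove the finished task; otherwise WhenAny would keep returning it
    // immediately and this loop would spin without waiting.
    pending.Remove(completed);

    // IsCompleted is also true for faulted and canceled tasks.
    completedCount = tasks.Count(x => x.IsCompleted || x.IsFaulted || x.IsCanceled);
}

Console.WriteLine("Done!");
Console.ReadKey();

async Task DoLongTask(int delay)
{
    // yield for bulk starting of tasks
    await Task.Yield();

    // do a long async task here
    await Task.Delay(delay);
}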
 
 
Comments
Milchenka 7-Mar-23 3:06am    
Thank you for your suggestion, I'll check it out! I tried it, but unfortunately, when I run this code with my computations in place of the line "await Task.Delay(delay);", it calculates the first task successfully and then just sits there doing nothing, not starting the other tasks ("waiting for activation"). Sorry, I rewrote my comment before I saw your reply.
Graeme_Grant 7-Mar-23 4:55am    
This is an oversimplification of how I use it, but it is a good starting point for you.

It uses the ThreadPool, so the threads are managed for you. Also, every task encapsulates its execution result, so if there is a failure, you can check why. I check for all execution result states when I get the completedCount.

When you use async/await, there is no guarantee that the method you call will actually run asynchronously; its internal implementation is free to complete on a fully synchronous path. So, if you're making an API where it is critical that you don't block, and there is a chance that the called method will run synchronously (effectively blocking), await Task.Yield() forces your method to be asynchronous and returns control to the caller at that point. The rest of the method executes later in the current context (and from there on it may still run synchronously).

In your case, where you need to start a number of jobs quickly and then wait, Task.Yield() returns quickly and lets the next job start right away.
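A minimal sketch of that behaviour, assuming a console application with no synchronization context. HeavyWork, WithoutYield, WithYield and the release gate are hypothetical names used only for illustration; the gate exists purely so the result of the demo does not depend on timing.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the heavy computation.
static double HeavyWork(int n)
{
    double sum = 0;
    for (int i = 1; i <= n; i++) sum += Math.Sqrt(i);
    return sum;
}

// Awaiting an already completed task does not suspend the method, so the
// whole body runs on the caller's thread before the call returns.
async Task<double> WithoutYield(int n)
{
    await Task.CompletedTask;
    return HeavyWork(n);
}

// Task.Yield() always suspends: the call returns to the caller here and the
// rest of the body is queued (to the thread pool in a console application).
async Task<double> WithYield(int n, ManualResetEventSlim release)
{
    await Task.Yield();
    release.Wait();          // demo only: hold the work until the caller has looked
    return HeavyWork(n);
}

Task<double> first = WithoutYield(1_000);
bool firstDoneOnReturn = first.IsCompleted;      // body already ran synchronously

using var gate = new ManualResetEventSlim(false);
Task<double> second = WithYield(1_000, gate);
bool secondDoneOnReturn = second.IsCompleted;    // call returned before the work ran
gate.Set();

double a = await first;
double b = await second;

Console.WriteLine($"Without Yield, completed on return: {firstDoneOnReturn}");
Console.WriteLine($"With Yield, completed on return:    {secondDoneOnReturn}");
```

For purely CPU-bound work, Task.Run is a common alternative way to get the same "return immediately, compute on the thread pool" effect.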
Creating "cold" tasks and managing the threads yourself seems like a bad idea. Why not create "hot" tasks and let the thread pool take care of scheduling them? Or use Parallel.For or Parallel.ForEach instead?
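As a rough sketch of the "hot" task idea, assuming the work can be expressed as a function of the task index: Task.Run queues each task to the thread pool as soon as it is created, and Task.WhenAll collects the results in input order. HeavyWork and taskCount are hypothetical placeholders, and the count is kept small here for the demo.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the heavy computation.
static double HeavyWork(int seed)
{
    double sum = 0;
    for (int i = 1; i <= 10_000; i++) sum += Math.Sqrt(seed + i);
    return sum;
}

const int taskCount = 100;   // assumption: small count for the demo

// Each Task.Run call returns a task that is already scheduled ("hot").
// The thread pool decides how many of them run at the same time.
Task<double>[] tasks = Enumerable.Range(0, taskCount)
    .Select(i => Task.Run(() => HeavyWork(i)))
    .ToArray();

// WhenAll completes when every task has finished; results keep the input order.
double[] results = await Task.WhenAll(tasks);

Console.WriteLine($"Collected {results.Length} results.");
```

Because no batch has to wait for its slowest member, a free core picks up the next queued task as soon as it finishes the previous one.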

With your current code, unless every task takes exactly the same length of time to complete, you will be "wasting" cores. Consider: You have ten tasks to run. You start five tasks. Four of them complete in one minute; the fifth takes three minutes. For two minutes, only one task is running; the other five pending tasks have not started.
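One way to avoid those idle cores is Parallel.For, sketched below under the assumption that each item can be computed independently from its index. HeavyWork and itemCount are hypothetical placeholders. ParallelOptions.MaxDegreeOfParallelism caps the number of concurrent workers; leaving one logical core free is only an illustrative choice, not a recommendation from the thread.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the heavy computation.
static double HeavyWork(int index)
{
    double sum = 0;
    for (int i = 1; i <= 10_000; i++) sum += Math.Sqrt(index + i);
    return sum;
}

const int itemCount = 1_000;   // assumption: small count for the demo
double[] results = new double[itemCount];

var options = new ParallelOptions
{
    // Assumption: keep one logical core free for the rest of the system.
    MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1)
};

// Parallel.For hands out iterations to workers as they become free, so a
// slow item does not hold back the items that come after it.
Parallel.For(0, itemCount, options, i =>
{
    // Each iteration writes only its own slot, so no locking is needed.
    results[i] = HeavyWork(i);
});

Console.WriteLine($"Computed {results.Length} results.");
```

Parallel.For blocks the calling thread until all iterations have finished, so the results array is fully populated when the call returns.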
 
 
Comments
Milchenka 6-Mar-23 6:34am    
Thank you for your reply. I will look into Parallel.For/ForEach; I had never heard of it before :P
By "hot", do you mean starting all the tasks as I create them and then waiting for them one after another, instead of in the batches I'm using?
The reason I run the tasks in batches of 16 is to prevent switching between tasks (I hope) and to leave tiny gaps where my CPU can breathe and handle things like Task Manager and other Windows stuff. The tasks are identical, with very nearly identical computation times.
I tried using the thread pool a few times, but the examples I found didn't match my case and failed, hence this approach.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


