Click here to Skip to main content
Click here to Skip to main content

Comparison between Different Methods to Iterate Over a List of Items

, 7 Mar 2011
Rate this:
Please Sign up or sign in to vote.
Comparison between different methods to iterate over a list of items and see which method is the most effective

Introduction

The purpose of this article is to examine different ways to iterate over a list of items, and see the comparative time it takes to iterate over the lists using either one thread or multi threads in different ways.

The envisioned use of such a loop is to price a set of trades, work out some complicated mathematical problems, download files from a series of remote machines or other such applications.

Visual application that need a multi threaded approach, e.g., a Winforms application that has a background worker thread, isn't covered by this article as the background thread is needed in this case to make sure that the interface stays responsive while doing some work in the background, e.g., copying large files.

The methods under examination are:

  1. A traditional for loop on a single thread
  2. Traditional for loop starting a new thread for each item
  3. Using Traditional task pooling
  4. Using the new task parallel library for loop

Background

The background for this article is that I looked on the internet for an article that would give me statistics on comparing how long each of the above methods would take to run instead of using my gut feel. Due to lack of such an article, I wrote a program to do it myself and decided to share.

Test for Thread Looping Test Class

The first thing that we need to do is write a test that will show how we are going to execute a test method in each of the four ways.

The test is located in the project called LoopExecution.Test.

The test is as follows:

1. [TestMethod]
2. public void Assert_That_Loops_Run()
3. {
4.     List<int> listOfInt = new List<int>();
5.     int numberOfTimesToRun = 32;
6.     for (int i = 0; i < numberOfTimesToRun; i++)
7.    {
8.         listOfInt.Add(i);
9.     }
10. 
11.     LoopExecution<int> loopExecution = new LoopExecution<int>();
12. 
13.     loopExecution.ExecutionMethod += 
	new LoopExecution<int>.MethodToExecute(loopExecution_ExecutionMethod);
14.     loopExecution.CollectionToIterateOver = listOfInt;
15. 
16.     loopExecution.Execute();
17. 
18.     Assert.IsTrue(loopExecution.NormalLoopTime > 0);
19.     Assert.IsTrue(loopExecution.ThreadLoopTime > 0);
20.     Assert.IsTrue(loopExecution.ThreadPoolLoopTime > 0);
21.     Assert.IsTrue(loopExecution.TaskLoopTime > 0);
22. }

The first part of the test (lines 4-9) creates the list we are going to iterative over. I've chosen to use a simple list of INT so as to make the test more easier to understand and simple.

On line 11, we create the object that is going to do the loop iteration. This is a generic type so that it can be used to test all different classes.

Line 13 assigns an event to the class that will be called by each iteration of the loop.

Below is the code for the method that is used in the test. It use the standard .NET event signature and all the method does is write a Debug message.

1. void loopExecution_ExecutionMethod
	(object sender, LoopExecution<int>.LoopExecutionEventArgs e)
2.         {
3.    Debug.WriteLine("{0}number{1}.Threadname:{2}.Objectoperatingon{3}",
	e.MethodName,e.LoopExecutionNumber,Thread.CurrentThread.ManagedThreadId,
	e.ObjectToOperateOn);
4. }

One interesting point to note is that as part of the event args class, I pass in the object that is for that iteration of the loop. This means that you can access that object to perform operations on.

On line 14 of the test, the list that will be iterated over is assigned to the looping class and then on line 16, we call execute.

To check that all of the different looping methods have been executed and the time it took to execute set for each method, at the end of the test I do an assert that the time for each one is greater than zero. The asserts don't have a specific value as on different machines and depending on what is running in the background you can get different results, however all the results should be greater then zero. The asserts could be improved by using the delegate to set a flag that for that particular method and iteration the method had been called.

Thread Looping Class

The main code for the execution of the Loop execution is located in the project called LoopExecution.Library.

Before we look at the actual main part of the class that runs the looping test, I am going to cover first the event that is fired by each of the methods.

1. public delegate void MethodToExecute(object sender, LoopExecutionEventArgs e);
2. public event MethodToExecute ExecutionMethod;
3. 
4. protected void OnExecuteMethod(object sender, LoopExecutionEventArgs e)
5. {
6.     if (ExecutionMethod != null)
7.     {
8.         ExecutionMethod(sender, e);
9.     }
10. }

This is a a pretty standard way of handling events in .NET so I won't go into any detail on this. The only interesting part is the Event args. This is defined as an inner class. The reason for this is that I wanted to make sure that it uses the same type defined in the Generic of the main class.

The Loop Execution Event Args class looks like:

1.  public class LoopExecutionEventArgs : EventArgs
2. {
3.     public String MethodName { get; set; }
4.     public T ObjectToOperateOn { get; set; }
5.     public int LoopExecutionNumber { get; set; }
6. }

On line 4, the type T is defied in the main class. This is just a normal event class with the additional info of the method that is executing it, the current object form the loop and the current position in the loop.

The Main Thread Looping Class

The class signature for the main class is:

1. public class LoopExecution<t>
2. {
3. …
4. }

This allows us to set what type of object we will be interacting with by use of the generic type.

Now we get to the crunch. The main method that will run each of our loops and give us the times for each loop.

1. public void Execute()
2. {
3.     NormalExecute();
4.     ThreadExecute();
5.     ThreadPoolExecute();
6.     TaskParallelExecute();
7. }

So nothing really exciting here. All this method does is call the four methods for each type of loop. This is formatted in a clean code way so you can see at a glance what the purpose of the method is.

Normal Loop

The first part to examine is the Normal loop. This consists of two methods which are as below:

1. private void NormalExecute()
2. {
3.     watch.Start();
4.     NormalWhileLoop();
5.     NormalLoopTime = watch.ElapsedMilliseconds;
6.     watch.Reset();
7. }
8. 
9. private void NormalWhileLoop()
10. {
11.     for (int i = 0; i < this.CollectionToIterateOver.Count; i++)
12.     {
13.         LoopExecutionEventArgs e = new LoopExecutionEventArgs();
14.         StackFrame stackFrame = new StackFrame();
15.         e.MethodName = stackFrame.GetMethod().ToString();
16.         e.LoopExecutionNumber = i;
17.         e.ObjectToOperateOn = CollectionToIterateOver[i];
18. 
19.         OnExecuteMethod(this, e);
20.     }
21. }

This should be fairly straight forward to understand. The first method used a field called watch which is just a System.Diagnostics.Stopwatch to mark the start and end of how long this loop takes to run.

The second method is just a stand loop that iterates over all the items in the list and using the current item, then creates a LoopExecutionEventArgs with the current method and iteration number. All the loop then does is execute the event callback via the OnExecuteMethod we saw above.

Thread Loop

The next part of the program is similar to the normal loop but instead of the event callback being executed on the same thread, a new thread is created and the event callback is run on that.

1. private void ThreadExecute()
2. {
3.     watch.Start();
4.     ThreadLoop();
5.     ThreadLoopTime = watch.ElapsedMilliseconds;
6.     watch.Reset();
7. }
8. 
9. private void ThreadLoop()
10. {
11. 
12.     for (int i = 0; i < this.CollectionToIterateOver.Count; i++)
13.     {
14.         Thread t = new Thread(x =>
15.         {
16.             LoopExecutionEventArgs e = new LoopExecutionEventArgs();
17.             StackFrame stackFrame = new StackFrame();
18.             e.MethodName = stackFrame.GetMethod().ToString();
19.             e.LoopExecutionNumber = (int)x;
20.             e.ObjectToOperateOn = CollectionToIterateOver[(int)x];
21.             OnExecuteMethod(this, e);
22.         });
23.         t.Start(i);
24.     }
25. }

As you can see, the first method is the same as the normal loops first method with the only differences being setting it for the ThreadLoop. The second method just uses an anonymous method which will call the event and then starts the thread.

Thread Pool

The next method makes use of a thread pool. The methods used for this are:

1. private void ThreadPoolExecute()
2. {
3.     watch.Start();
4.     ThreadPoolLoop();
5.     ThreadPoolLoopTime = watch.ElapsedMilliseconds;
6.     watch.Reset();
7. }
8. 
9. private void ThreadPoolLoop()
10. {
11.     int toProcess = this.CollectionToIterateOver.Count;
12.     using (ManualResetEvent resetEvent = new ManualResetEvent(false))
13.     {
14.         for (int i = 0; i < this.CollectionToIterateOver.Count; i++)
15.         {
16.             ThreadPool.QueueUserWorkItem(new WaitCallback(x =>
17.             {
18.                 LoopExecutionEventArgs e = new LoopExecutionEventArgs();
19.                 StackFrame stackFrame = new StackFrame();
20.                 e.MethodName = stackFrame.GetMethod().ToString();
21.                 e.LoopExecutionNumber = (int)x;
22.                 e.ObjectToOperateOn = CollectionToIterateOver[(int)x];
23.                 OnExecuteMethod(this, e);
24.                 if (Interlocked.Decrement(ref toProcess) == 0)
25.                     resetEvent.Set();
26. 
27.             }
28.                             ), i);
29.         }
30.         resetEvent.WaitOne();
31.     }
32. }

The first method is the same as we have seen in each of the other cases. The second method gets more interesting. As we are using the standard .NET Thread pool, there is a potential race condition. The Thread pool starts each worker thread off as a background process. This means we need to enforce a stop condition on all the threads completing before the application carries on executing, else we will get an incorrect reading on the amount of time it takes to execute using this method.

This is done by using a ManualResetEvent and forcing it to wait on line 30. This is set to run once all the threads have completed running by the set event on line 25. Line 25 is only called once toProcess reaches zero. toProcess is set to be the number of items in the list on line 11. This is one of the advantages of using an anonymous method in this situation as you don't need to pass in a ref to toProcess as the anonymous method can make use of it.

Task Parallel Library

The final method uses the new Task Parallel Library from .NET 4.0.

The methods look like:

1. private void TaskParallelExecute()
2. {
3.     watch.Start();
4.     TaskParallelLibraryLoop();
5.     TaskLoopTime = watch.ElapsedMilliseconds;
6.     watch.Reset();
7. }
8. 
9. private void TaskParallelLibraryLoop()
10. {
11.     Parallel.For(0, this.CollectionToIterateOver.Count, i =>
12.         {
13.             LoopExecutionEventArgs e = new LoopExecutionEventArgs();
14.             StackFrame stackFrame = new StackFrame();
15.             e.MethodName = stackFrame.GetMethod().ToString();
16.             e.LoopExecutionNumber = i;
17.             e.ObjectToOperateOn = CollectionToIterateOver[i];
18.             OnExecuteMethod(this, e);
19.         })
20.     ;
21. }

The first method is the same as the other three however it is with the second method that we use that may be unfamiliar to people. According to MSDN, the method signature for a Parrallel.For is:

public static ParallelLoopResult For(
        int fromInclusive,
        int toExclusive,
        Action<int> body
)

where the body is a delegate that is executed once per iteration.

The method passes in to the delegate an int representing the current iteration that will be executed. In this case, the delegate is an anonymous method that is used to call the delegate that will do our work.

Matrices of Execution

All of the code used to run an application that outputted the results for the Matrices is located in LoopExecution.Application. This is currently a console application which prompts for the number of times you wish to run the loop in incremental steps of one and the two current types of delegates used to test the application. Future version will hopefully have a GUI and be able to dynamically load up custom objects and methods to work on.

Computer Specification

For the sake of completeness, the test machine being used to run this application on is an Intel Core 2 Duo T600 @ 2.00GHz 2.00GHz with 3.00GB of RAM and Windows 7 Home Premium SP1 32 bit OS.

Now the class that will be used to perform the matrix has been explained, let's run it a few times and see what results we get.

Simple Test

The first test will be run using a method that only writes to the console. This method is shown below:

1. static void loopExecution_ExecutionMethodConsoleWriteLine
	(object sender, LoopExecution<int>.LoopExecutionEventArgs e)
2. {
3.     Console.WriteLine("{0} number {1}. Thread name : {2}", 
	e.MethodName, e.LoopExecutionNumber, Thread.CurrentThread.ManagedThreadId);
4. }

All this does is a simple print line.

If we run this for 128 times, we get the following results:

Total run time for Normal loop 2210ms 
Total run time for Thread loop 53663ms 
Total run time for Thread Pool 3255 ms 
Total run time for Task Parallel Library 3250ms 

The full results for this can be found in the ThreadReultsWriteLine.csv.

There are a few interesting points to notice from these results.

The first is that although it isn't shown in the results on some of the runs, the Task Parallel Library did run quicker than the Normal loop but over the course of the entire run, the Normal loop ran quicker.

The second and most obvious point is that it doesn’t always make sense to run a process on separate threads. The reason is that the overhead of starting a new thread is greater than the time it takes to actually execute the delegate.

Simulating Work Test

So as to have an example of a delegate that stimulates actually doing some work that can take a variety of time, the following delegate was used to run the test 128 times again.

1. static void loopExecution_ExecutionMethodSleep
	(object sender, LoopExecution<int>.LoopExecutionEventArgs e)
2. {
3. 
4.     if (e.LoopExecutionNumber % 11 == 0)
5.     {
6.         Thread.Sleep((int)(e.LoopExecutionNumber * 0.149) + 1);
7.     }
8.     else if (e.LoopExecutionNumber % 23 == 0)
9.     {
10.         Thread.Sleep((int)(e.LoopExecutionNumber * 0.344));
11.     }
12.     Console.WriteLine("{0} number {1}. Thread name : {2}", 
	e.MethodName, e.LoopExecutionNumber, Thread.CurrentThread.ManagedThreadId);
13. }

The above method just sends the current thread of execution to sleep depending on the current iteration. These numbers were picked at random however I did them at such intervals and with the multiplication factor in order that the test doesn’t take too long to run. Also when running a live application, the time it takes to complete each task for an object could be different depending on whether it was accessing a network share, database, internet, local disk or memory object.

If we run this 128 times, we get the following results:

Total run time for Normal loop 11838ms 
Total run time for Thread loop 63148ms 
Total run time for Thread Pool 6150ms 
Total run time for Task Parallel Library 5110ms 

The full results for this test can be found in ThreadReultsSleep.csv.

The most interesting statistic to notice here is that in this case, the two methods used pre created threads or task was the quickest. This was because off the obvious reason that when we sent the thread to sleep, other threads will be operated on whereas for the normal loop as it is only one thread, it has to wait for the sleep to complete before it can move onto the next iteration.

The other statistic of note if that there isn't much difference between the Thread Pool and the Task Parallel Library, although that will grow if more threads are used. However apart from the fact that the Task Parallel Library was quicker if you re examine the code the Task Parallel Library is much easier to read, understand and use. The other benefit of using this approach is you don't need to worry about all threads completing before you move on to the next stage of the application. So from a maintenance point of view, Task Parallel Library wins.

Conclusion

So in conclusion, for a real world program where the operation being performed is a quick operation, it doesn’t make sense to go straight for a multi threaded option.

If however the operation being performed is going to be processes intensive or depends on other external factors that mean the thread will be in a blocked state, then it is worth using a multi threaded application.

History

  • 8th March, 2011: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Michael Bookatz
Software Developer
United Kingdom United Kingdom
I'm a software developer with 4 years experience of using C# and Microsoft SQL server.
 
I work on line of business desktop applications that cross the entire development stack with the full development life cycle.

Comments and Discussions

 
GeneralMy vote of 3 PinmemberAmol Rawke Pune10-Mar-11 17:30 
GeneralRe: My vote of 3 PinmemberMichael Bookatz10-Mar-11 22:45 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web01 | 2.8.140721.1 | Last Updated 8 Mar 2011
Article Copyright 2011 by Michael Bookatz
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid