Click here to Skip to main content
15,887,027 members
Articles / Programming Languages / C#
Article

Iteration Performance in .NET

Rate me:
Please Sign up or sign in to vote.
4.69/5 (19 votes)
20 May 20031 min read 110.8K   567   25   21
Article on the relative performance of various methods of iterating through large amounts of data.

Introduction

I’ve been implementing numerical libraries in .NET and have come to some conclusions about iteration performance. My classes have to hold a large amount of data and be able to iterate through that data as quickly as possible. In order to compare various methods, I created a simple class called Data that encapsulates an array of doubles.

Method #1: Enumeration

Data implements IEnumerable. It contains GetEnumerator which returns its own DataEnumerator, an inner class.
C#
public DataEnumerator GetEnumerator() 
    {
      return new DataEnumerator( this );
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
      return GetEnumerator();
    }
  }
	
…
  public struct DataEnumerator : IEnumerator
  {
    private Data internal_;
    private int index_;

    public DataEnumerator( Data data ) 
    {
      internal_ = data;
      index_ = 1;
    }

    public double Current
    {
      get
      {
        return internal_.Array[index_];
      }
    }

    object IEnumerator.Current 
    {
      get
      {
        return Current;
      }
    }
  
    public bool MoveNext()
    {
      index_++;
      if ( index_ >= internal_.Array.Length ) 
      {
        return false;
      }
      return true;
    }

    public void Reset()
    {
      index_ = -1;
    }
  }
}
…

Method #2: Indexing

I implemented an index operator on the class which simply calls the index operator on the array.

C#
public double this[int position]
{
  get
  {
    return array_[position];
  }
}

Method #3: Indirect Array

I created a property to access the array.

C#
public double[] Array
{
  get
  {
    return array_;
  }
}

When iterating, I called the Array property and then its index operator.

C#
d = data.Array[j];

Method #4: Direct Array

I created a reference to the array.

C#
double[] array = data.Array;

Then, I iterate through that reference.

C#
d = array[j];

Method #5: Pointer Math

Finally, I tried improving performance by iterating through the array in Managed C++ using pointer manipulation.

MC++
static void iterate( Data& data )
{
  double d;
  double __pin* ptr = &( data.Array[0] );
  for ( int i = 0; i < data.Array.Length; i++ )
  {
    d = *ptr;
    ++ptr;
  }
}

I called it this way:

MC++
Pointer.iterate( data );

Conclusions

To test the different methods, I allocated 1,000,000 doubles into an array and indexed over all of them. I repeated this 1,000 times to minimize randomness. Here are the results...

Results

Enumeration is always slow. That’s not surprising as I’m using a general data structure to hold the doubles. The three operator/property methods differed very slightly. These are probably all optimized similarly. Using pointer math to traverse over the raw data was significantly faster. This is probably due to the fact that there’s no bounds checking. In summary, if you have large amounts of data and performance is critical, consider using managed C++.

Acknowledgements

Thanks to Mark Vulfson of ProWorks for tips on using Flipper Graph Control. Also, to my colleagues Ken Baldwin and Steve Sneller at CenterSpace Software.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here


Written By
CEO CenterSpace Software
United States United States
Trevor has held demanding development positions for a variety of firms using C++, Java, .NET, and other technologies, including Rogue Wave Software, CleverSet, and ProWorks. He is coauthor of The Elements of Java Style , The Elements of C++ Style, and The Elements of C# Style, published by Cambridge University Press. He has also served on a course advisory board of the University of Washington. His teams have won the JavaWorld "GUI Product of the Year" and XML Magazine "Product of the Year" awards. Trevor holds a BSc in Computer Science from the University of British Columbia and a BA in Economics from the University of Western Ontario.

Comments and Discussions

 
GeneralNice Pin
Anthony_Yio13-Oct-03 15:51
Anthony_Yio13-Oct-03 15:51 
GeneralEnumerations Pin
Paul Evans26-May-03 23:19
Paul Evans26-May-03 23:19 
GeneralRe: Enumerations Pin
asm4328-May-03 11:21
asm4328-May-03 11:21 
GeneralUpdate Pin
Trevor Misfeldt7-May-03 5:55
Trevor Misfeldt7-May-03 5:55 
GeneralRe: Update Pin
dog_spawn21-May-03 16:22
dog_spawn21-May-03 16:22 
GeneralRe: Update Pin
Morten Sparding11-Mar-05 22:28
Morten Sparding11-Mar-05 22:28 
Generalunsafe C# Pin
John Rayner7-May-03 5:04
John Rayner7-May-03 5:04 
GeneralRe: unsafe C# Pin
Trevor Misfeldt7-May-03 5:53
Trevor Misfeldt7-May-03 5:53 
GeneralEnumeration performance Pin
James T. Johnson2-May-03 1:59
James T. Johnson2-May-03 1:59 
GeneralRe: Enumeration performance Pin
Trevor Misfeldt2-May-03 4:29
Trevor Misfeldt2-May-03 4:29 
GeneralRe: Enumeration performance Pin
Andy Gray6-May-03 6:23
Andy Gray6-May-03 6:23 
GeneralRe: Enumeration performance Pin
JedrekM20-Jun-03 8:35
JedrekM20-Jun-03 8:35 
GeneralRe: Enumeration performance Pin
Trevor Misfeldt20-Jun-03 9:46
Trevor Misfeldt20-Jun-03 9:46 
GeneralRe: Enumeration performance Pin
Oscar Bowyer6-May-03 3:44
Oscar Bowyer6-May-03 3:44 
GeneralRe: Enumeration performance Pin
James T. Johnson6-May-03 4:36
James T. Johnson6-May-03 4:36 
GeneralRe: Enumeration performance Pin
mikasa22-May-03 3:11
mikasa22-May-03 3:11 
GeneralRe: Enumeration performance Pin
James T. Johnson22-May-03 9:11
James T. Johnson22-May-03 9:11 
What you're describing is a dictionary-style collection. The framework has a generic one: System.Collections.Hashtable, and CollectionGen has one called HashTable.

HashTables work on the concept of buckets and hashes. The table maintains an array of 'buckets', each bucket is then also has an array or linked list of the key-value pairs you want to store.

The hash is a number generated from the key and is used as in index into the array of buckets, more on this later; for now just know that it is an Integer

To retreive a value out of the dictionary: Once it finds the correct bucket it then goes through all of the items in that bucket to find the key that is equal to the one used. If it finds the key then it returns that value, otherwise it returns Nothing

To add a value to the dictionary: Once it finds the correct bucket it goes through each of the items to see if the key already exists, if it doesn't then it adds your key-value pair to that bucket. If it does find your key it either replaces the value stored there or it raises an exception (if you are using the Item property or Add method, respectively)

While HashTables have a lot going on behind the scenes, there is one major downfall to them. The performance of the table relies on a good hashing algorithm. Unlike Java, .NET lets any object serve as the key to a HashTable because System.Object has a GetHashCode method which is supposed to return the hash for that object.

From the documentation, GetHashCode's default behavior is suitable unless you override the Equals method and change it from using reference equality to something else.

This adds a frusturating aspect to using your own classes in a HashTable. If two objects are equal (the Equals method returns true) then the hash returned by GetHashCode must be the same. So for the duration that your object is serving as a key it must be immutable. Most objects that people write aren't this way so they make poor candidates for use as a key.

HTH,

James

"I despise the city and much prefer being where a traffic jam means a line-up at McDonald's"
Me when telling a friend why I wouldn't want to live with him

GeneralRe: Enumeration performance Pin
mikasa22-May-03 9:18
mikasa22-May-03 9:18 
GeneralRe: Enumeration performance Pin
James T. Johnson22-May-03 9:29
James T. Johnson22-May-03 9:29 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.