Iterators in C# - A Deep Dive

28 Mar 2014

Introduction

This article takes an in-depth look at how the C# yield keyword works under the hood.

If you are not familiar with the yield keyword or have never used it before, check out my post on Iterators in C# on my original blog or on CodeProject.

Using Iterators is easy, but it's always good to know how this thing works under the hood, right?

To see what is going on, let's start with a simple C# iterator method that yields a short sequence of values.

Here is the code:

C#
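using System.Collections;

public class InDepth
{
    // Iterator method: with the non-generic IEnumerator return type,
    // every yielded value is exposed as object.
    static IEnumerator DoSomething()
    {
        yield return "start";

        for (int i = 1; i < 3; i++)
        {
            yield return i.ToString();
        }

        yield return "end";
    }
}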

Pretty simple, isn't it? Let's have a look at the compiled code, viewed through a decompiler:

C#
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;

namespace YieldDemo
{
  public class InDepth
  {
    public InDepth()
    {
      base..ctor();
    }

    private static IEnumerator DoSomething()
    {
      InDepth.<DoSomething>d__0 doSomethingD0 = new InDepth.<DoSomething>d__0(0);
      return (IEnumerator) doSomethingD0;
    }

    [CompilerGenerated]
    private sealed class <DoSomething>d__0 : IEnumerator<object>, IEnumerator, IDisposable
    {
      private object <>2__current;
      private int <>1__state;
      public int <i>5__1;

      object IEnumerator<object>.Current
      {
        [DebuggerHidden] get
        {
          return this.<>2__current;
        }
      }

      object IEnumerator.Current
      {
        [DebuggerHidden] get
        {
          return this.<>2__current;
        }
      }

      [DebuggerHidden]
      public <DoSomething>d__0(int <>1__state)
      {
        base..ctor();
        this.<>1__state = <>1__state;
      }

      bool IEnumerator.MoveNext()
      {
        switch (this.<>1__state)
        {
          case 0:
            this.<>1__state = -1;
            this.<>2__current = (object) "start";
            this.<>1__state = 1;
            return true;
          case 1:
            this.<>1__state = -1;
            this.<i>5__1 = 1;
            break;
          case 2:
            this.<>1__state = -1;
            ++this.<i>5__1;
            break;
          case 3:
            this.<>1__state = -1;
            goto default;
          default:
            return false;
        }
        if (this.<i>5__1 < 3)
        {
          this.<>2__current = (object) this.<i>5__1.ToString();
          this.<>1__state = 2;
          return true;
        }
        else
        {
          this.<>2__current = (object) "end";
          this.<>1__state = 3;
          return true;
        }
      }

      [DebuggerHidden]
      void IEnumerator.Reset()
      {
        throw new NotSupportedException();
      }

      void IDisposable.Dispose()
      {
      }
    }
  }
}

Shocked? I wrote barely 10 lines of code (LOC), but the compiler generated far more. The compiler turns the iterator block into an auto-generated state machine to implement the yield functionality. Let's examine the generated code.

Overall Observation

  1. The code shown is not valid C#. That is deliberate: names such as <DoSomething>d__0 and <>1__state are illegal in C# source, so compiler-generated members can never collide with the classes, methods and variables we declare ourselves.
  2. The generated class is decorated with [CompilerGenerated], and several of its members with [DebuggerHidden]. The CompilerGenerated attribute distinguishes compiler-generated elements from user-written ones, while the DebuggerHidden attribute tells the debugger not to stop in the member.
  3. <DoSomething>d__0 implements three interfaces, IEnumerator<object>, IEnumerator and IDisposable, even though our method only declared IEnumerator as its return type. The compiler implements the generic form of IEnumerator even though we used the non-generic one. Because IEnumerator<T> itself inherits from IEnumerator and IDisposable, implementing IEnumerator<object> brings in the other two.
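The third observation can be checked at runtime with a little reflection. The following is a minimal sketch, not part of the original sample: the class name InterfaceCheck and its helper methods are made up for illustration, and the iterator is shortened to two yields.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class InterfaceCheck
{
    // Same shape as the article's iterator: non-generic IEnumerator return type.
    static IEnumerator DoSomething()
    {
        yield return "start";
        yield return "end";
    }

    // True when IEnumerator<object> itself inherits the given interface.
    public static bool GenericInherits(Type iface)
    {
        return Array.IndexOf(typeof(IEnumerator<object>).GetInterfaces(), iface) >= 0;
    }

    // True when the object handed back by the iterator method
    // also implements the generic IEnumerator<object>.
    public static bool ReturnsGenericEnumerator()
    {
        IEnumerator it = DoSomething();
        return it is IEnumerator<object>;
    }

    // Name of the generated class. The exact text is a compiler
    // implementation detail and may differ between compilers.
    public static string GeneratedTypeName()
    {
        return DoSomething().GetType().Name;
    }
}
```

Both GenericInherits(typeof(IEnumerator)) and GenericInherits(typeof(IDisposable)) should come back true, which is why the generated class lists all three interfaces. GeneratedTypeName() is only there for curiosity; don't rely on its value.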

There's a whole lot of magic happening in <DoSomething>d__0. Let's have a closer look at it.

  1. Three fields are declared in the class: <>1__state, <>2__current and <i>5__1. <>1__state tracks how far execution has progressed. <>2__current holds the value that the iterator's Current property returns. <i>5__1 is the loop variable i, hoisted into a field so that it survives between calls.
  2. The state and current fields are private, while the loop variable is public. If the iterator method takes parameters, they also become public fields.
  3. There is an important thing to note here. DoSomething() creates an instance of <DoSomething>d__0 and passes 0 to its constructor. The value passed depends on the return type of the iterator block. For example, if the return type is IEnumerable<int>, the initial value is -2 instead of 0.
  4. There are two versions of the Current property, generic and non-generic. Both return <>2__current. The methods implemented are MoveNext(), Reset() and Dispose().
  5. The Reset() method always throws NotSupportedException. This is in line with the C# specification.
  6. Whatever code you write in the iterator block ends up in the MoveNext() method, which here is built around a switch statement on the current state. The current value, the state and the loop variable are all updated inside this method, so each call picks up where the previous one left off.
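Since the generated class cannot be compiled as written, it may help to see the same state machine written by hand with legal identifiers. This is only a sketch that mirrors the decompiled MoveNext() above; the names DoSomethingIterator and StateMachineDemo are hypothetical, and the real generated class differs in naming and attributes.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written equivalent of the generated class, using legal names.
public sealed class DoSomethingIterator : IEnumerator<string>
{
    private int state;       // plays the role of <>1__state
    private string current;  // plays the role of <>2__current
    private int i;           // plays the role of <i>5__1

    public DoSomethingIterator(int initialState) { state = initialState; }

    public string Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:             // first call: yield "start"
                state = -1;
                current = "start";
                state = 1;
                return true;
            case 1:             // resume after "start": initialise the loop
                state = -1;
                i = 1;
                break;
            case 2:             // resume inside the loop: increment
                state = -1;
                i++;
                break;
            case 3:             // resume after "end": nothing left to do
                state = -1;
                return false;
            default:            // already finished
                return false;
        }

        if (i < 3)
        {
            current = i.ToString();
            state = 2;
            return true;
        }

        current = "end";
        state = 3;
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

public static class StateMachineDemo
{
    // Drives the state machine by hand and collects everything it yields.
    public static List<string> Run()
    {
        var results = new List<string>();
        using (var it = new DoSomethingIterator(0))
        {
            while (it.MoveNext()) results.Add(it.Current);
        }
        return results;
    }
}
```

StateMachineDemo.Run() collects "start", "1", "2" and "end", the same values the original iterator block yields.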

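The iterator doesn't run on its own. Calling the iterator method only creates the state machine object; none of the code in the iterator block executes yet. Execution starts with the first call to MoveNext(). Each call runs the body until it reaches the next yield return, a yield break, or the end of the method. MoveNext() returns true after a yield return and false once a yield break or the end of the method is reached.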
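The deferred start is easy to observe by logging from inside the iterator block. Here is a small sketch; LazyDemo and its Log list are names invented for this illustration, and the iterator is a cut-down version of the one above.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Records the order in which things happen.
    public static readonly List<string> Log = new List<string>();

    static IEnumerator<string> DoSomething()
    {
        Log.Add("body started");
        yield return "start";
        Log.Add("after first yield");
        yield return "end";
    }

    public static void Run()
    {
        Log.Clear();

        IEnumerator<string> it = DoSomething();   // no body code runs here
        Log.Add("iterator created");

        it.MoveNext();                            // runs up to the first yield return
        Log.Add("current = " + it.Current);

        it.MoveNext();                            // resumes and runs to the second one
        Log.Add("current = " + it.Current);

        bool more = it.MoveNext();                // reaches the end of the method
        Log.Add("more = " + more);
    }
}
```

After Run(), the log reads: iterator created, body started, current = start, after first yield, current = end, more = False. Note that "body started" comes after "iterator created", even though DoSomething() was called first.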

An important restriction is that you cannot yield return from a try block that has a catch block associated with it, with or without a finally block. You can, however, yield from a try block that has only a finally block.
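The allowed form, a try block with only a finally, looks like this. The sketch below uses made-up names (TryFinallyDemo, Numbers, Log) and also shows when the finally block runs if the consumer stops early.

```csharp
using System;
using System.Collections.Generic;

public static class TryFinallyDemo
{
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;   // allowed: the try block has no catch
            yield return 2;
        }
        finally
        {
            Log.Add("finally ran");
        }
    }

    public static void Run()
    {
        Log.Clear();
        foreach (int n in Numbers())
        {
            Log.Add("got " + n);
            break;   // foreach disposes the enumerator, which runs the finally block
        }
    }
}
```

After Run(), the log contains "got 1" followed by "finally ran". Adding a catch clause to the try block in Numbers() turns the yield return statements into compile errors.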

Until now, we've been returning IEnumerator from the iterator block. Let's replace IEnumerator with IEnumerable. Note also that the IEnumerator returned from the iterator block earlier was the non-generic version, so this time we'll use the generic form, IEnumerable<string>. Here is the code after modification.

C#
static IEnumerable<string> DoSomething(){
    yield return "start";

    for (int i = 1; i < 3; i++)
    {
        yield return i.ToString();
    }

    yield return "end";
}

Now let's look at the compiled code again and see what's new in the IEnumerable implementation. Here is the code:

C#
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;

namespace YieldDemo
{
  public class InDepth
  {
    public InDepth()
    {
      base..ctor();
    }

    private static IEnumerable<string> DoSomething()
    {
      InDepth.<DoSomething>d__0 doSomethingD0 = new InDepth.<DoSomething>d__0(-2);
      return (IEnumerable<string>) doSomethingD0;
    }

    [CompilerGenerated]
    private sealed class <DoSomething>d__0 : IEnumerable<string>, 
    IEnumerable, IEnumerator<string>, IEnumerator, IDisposable
    {
      private string <>2__current;
      private int <>1__state;
      private int <>l__initialThreadId;
      public int <i>5__1;

      string IEnumerator<string>.Current
      {
        [DebuggerHidden] get
        {
          return this.<>2__current;
        }
      }

      object IEnumerator.Current
      {
        [DebuggerHidden] get
        {
          return (object) this.<>2__current;
        }
      }

      [DebuggerHidden]
      public <DoSomething>d__0(int <>1__state)
      {
        base..ctor();
        this.<>1__state = <>1__state;
        this.<>l__initialThreadId = Environment.CurrentManagedThreadId;
      }

      [DebuggerHidden]
      IEnumerator<string> IEnumerable<string>.GetEnumerator()
      {
        InDepth.<DoSomething>d__0 doSomethingD0;
        if (Environment.CurrentManagedThreadId == this.<>l__initialThreadId && this.<>1__state == -2)
        {
          this.<>1__state = 0;
          doSomethingD0 = this;
        }
        else
          doSomethingD0 = new InDepth.<DoSomething>d__0(0);
        return (IEnumerator<string>) doSomethingD0;
      }

      [DebuggerHidden]
      IEnumerator IEnumerable.GetEnumerator()
      {
        return (IEnumerator) this.System.Collections.Generic.IEnumerable<System.String>.GetEnumerator();
      }

      bool IEnumerator.MoveNext()
      {
        switch (this.<>1__state)
        {
          case 0:
            this.<>1__state = -1;
            this.<>2__current = "start";
            this.<>1__state = 1;
            return true;
          case 1:
            this.<>1__state = -1;
            this.<i>5__1 = 1;
            break;
          case 2:
            this.<>1__state = -1;
            ++this.<i>5__1;
            break;
          case 3:
            this.<>1__state = -1;
            goto default;
          default:
            return false;
        }
        if (this.<i>5__1 < 3)
        {
          this.<>2__current = this.<i>5__1.ToString();
          this.<>1__state = 2;
          return true;
        }
        else
        {
          this.<>2__current = "end";
          this.<>1__state = 3;
          return true;
        }
      }

      [DebuggerHidden]
      void IEnumerator.Reset()
      {
        throw new NotSupportedException();
      }

      void IDisposable.Dispose()
      {
      }
    }
  }
}

Observations

  1. First, the return type of the DoSomething() method has changed to IEnumerable<string>.
  2. The argument passed to the <DoSomething>d__0 constructor has changed from 0 to -2.
  3. The compiler-generated <DoSomething>d__0 class now implements IEnumerable<string> and IEnumerable in addition to IEnumerator<string>, IEnumerator and IDisposable.
  4. The IEnumerator<string> implementation in the sealed class is almost the same as before. The Current property returns the current value, Reset() throws the same exception and MoveNext() has the same logic.
  5. A private field, <>l__initialThreadId, has been added. The constructor sets it to the ID of the current thread.

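Well, what happened? DoSomething() now returns an object that is both the IEnumerable<string> and its own enumerator. When a consumer such as foreach calls GetEnumerator(), the method checks whether it is running on the thread that created the object and whether the state is still -2. If so, it sets the state to 0 and returns the same object; otherwise it creates a fresh instance with state 0. From there on, everything works as in the IEnumerator case: iteration is forward-only and read-only, and it's the MoveNext() method that is called over and over again to return the values lazily.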
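One visible consequence of GetEnumerator() is that every enumeration of the same IEnumerable gets its own position. A minimal sketch, with EnumerableDemo as a name made up for this example:

```csharp
using System.Collections.Generic;

public static class EnumerableDemo
{
    static IEnumerable<string> DoSomething()
    {
        yield return "start";

        for (int i = 1; i < 3; i++)
        {
            yield return i.ToString();
        }

        yield return "end";
    }

    public static string Run()
    {
        IEnumerable<string> sequence = DoSomething();

        // Two enumerators obtained from the same sequence object.
        IEnumerator<string> first = sequence.GetEnumerator();
        IEnumerator<string> second = sequence.GetEnumerator();

        first.MoveNext();
        first.MoveNext();    // first is now positioned on "1"
        second.MoveNext();   // second is independent and sits on "start"

        return first.Current + "," + second.Current;
    }
}
```

Run() returns "1,start". With the generated code shown above, the first enumerator would be the sequence object itself and the second a new instance, but that reuse is an implementation detail of the compiler and not something to depend on.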

Why did the value passed to the <DoSomething>d__0 constructor change from 0 to -2? These numbers tell the state machine where it is. Here are the states it operates on:

  • 0: the work is yet to start (Before).
  • -1: the work is in progress (Running) or completed (After).
  • -2: specific to IEnumerable. This is the initial state before GetEnumerator() is called.
  • Greater than 0: the iterator is suspended at a yield return and will resume from that point.

Note that the -2 state is specific to IEnumerable; the other states belong to the IEnumerator. When GetEnumerator() is called on the IEnumerable, the state changes to 0 and the object then behaves as an IEnumerator.
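To watch these state values change, here is a hand-written stand-in for a generated IEnumerable<string> iterator. It is a sketch only: the iterator is cut down to two yields ("start" and "end") to keep it short, TwoStepSequence and StateTrace are hypothetical names, and the State property exists purely so the demo can peek at the field.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class TwoStepSequence : IEnumerable<string>, IEnumerator<string>
{
    private int state;
    private string current;
    private readonly int initialThreadId;

    public TwoStepSequence(int initialState)
    {
        state = initialState;
        initialThreadId = Environment.CurrentManagedThreadId;
    }

    // Exposed only so the demo can observe the state.
    public int State { get { return state; } }

    public IEnumerator<string> GetEnumerator()
    {
        // Reuse this object only for the first enumeration on the creating thread.
        if (state == -2 && initialThreadId == Environment.CurrentManagedThreadId)
        {
            state = 0;
            return this;
        }
        return new TwoStepSequence(0);
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public string Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0: state = -1; current = "start"; state = 1; return true;
            case 1: state = -1; current = "end"; state = 2; return true;
            case 2: state = -1; return false;
            default: return false;
        }
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

public static class StateTrace
{
    // Returns the state observed at each step of one full enumeration.
    public static string Run()
    {
        var seq = new TwoStepSequence(-2);
        var states = new List<int> { seq.State };      // before GetEnumerator
        IEnumerator<string> e = seq.GetEnumerator();
        states.Add(seq.State);                         // ready to start
        while (e.MoveNext()) states.Add(seq.State);    // suspended at each yield
        states.Add(seq.State);                         // finished
        return string.Join(",", states);
    }

    // The first GetEnumerator call reuses the object, the second one does not.
    public static bool SecondEnumeratorIsNew()
    {
        var seq = new TwoStepSequence(-2);
        IEnumerator<string> first = seq.GetEnumerator();
        IEnumerator<string> second = seq.GetEnumerator();
        return ReferenceEquals(seq, first) && !ReferenceEquals(seq, second);
    }
}
```

StateTrace.Run() returns "-2,0,1,2,-1": the object starts at -2, moves to 0 when GetEnumerator() is called, takes a positive value at each yield, and ends at -1.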

That's it! At first glance the generated code looks freaky, but once we work through it slowly, it turns out to be a lot easier than expected.

Please share your thoughts and reviews on this post! Thanks!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
India
Developer. Blogger.

Follow me on Code Rethinked
