Yield Return Could Be Better

Paulo Zemek

5 Aug 2011

This article discusses the yield return keyword, presents an alternative to it, and outlines the theory of a better yield return.

Background

I recently published the article "Writing a Multiplayer Game (in WPF)". While revising it to avoid the compiler-generated state machine, I managed to create an alternative that uses more resources but has some advantages. Thinking about it further, I came to believe that yield return could be better, so I decided to write this article to explain why and to ask readers to request that feature from Microsoft.

The feature request is at:

http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2107187-create-a-stacksaver-class-to-facilitate-state-mana

yield return keyword

The "yield return" and "yield break" keywords are used inside the body of methods that return IEnumerator<T> or IEnumerable<T> values. The compiler then generates the state machine for us.

For example, if you combine two enumerations using yield return, you will write something like this:

public static IEnumerable<T> Combine<T>(IEnumerable<T> first, IEnumerable<T> second)
{
  if (first != null)
    foreach(T value in first)
      yield return value;

  if (second != null)
    foreach(T value in second)
      yield return value;
}

And the compiler will in fact generate a new class that implements IEnumerable<T> and keeps track of where it is at each call of MoveNext. The actual implementation generated by the compiler is very ugly, so I will try to present a more human-readable one. The code to do the same job without yield return looks like this:

public static IEnumerable<T> Combine<T>(IEnumerable<T> first, IEnumerable<T> second)
{
  return new CombineEnumerable<T>(first, second);
}
internal sealed class CombineEnumerable<T>:
  IEnumerable<T>
{
  private IEnumerable<T> _first;
  private IEnumerable<T> _second;

  internal CombineEnumerable(IEnumerable<T> first, IEnumerable<T> second)
  {
    _first = first;
    _second = second;
  }

  public IEnumerator<T> GetEnumerator()
  {  
    IEnumerator<T> firstEnumerator = null;
    IEnumerator<T> secondEnumerator = null;

    if (_first != null)
      firstEnumerator = _first.GetEnumerator();

    if (_second != null)
      secondEnumerator = _second.GetEnumerator();

    return new CombineEnumerator<T>(firstEnumerator, secondEnumerator);
  }
  IEnumerator IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}
internal sealed class CombineEnumerator<T>:
  IEnumerator<T>
{
  private IEnumerator<T> _first;
  private IEnumerator<T> _second;
  private int _state;

  internal CombineEnumerator(IEnumerator<T> first, IEnumerator<T> second)
  {
    _first = first;
    _second = second;
  }

  public T Current { get; private set; }

  public void Dispose()
  {
    IEnumerator<T> first = _first;
    if (first != null)
    {
      _first = null;
      first.Dispose();
    }
        
    IEnumerator<T> second = _second;
    if (second != null)
    {
      _second = null;
      second.Dispose();
    }
  }

  object IEnumerator.Current
  {
    get
    {
      return Current;
    }
  }

  public bool MoveNext()
  {
    switch(_state)
    {
      case 0:
        if (_first == null || !_first.MoveNext())
        {
          _state = 1;
          goto case 1;
        }

        Current = _first.Current;
        return true;

      case 1:
        if (_second == null || !_second.MoveNext())
        {
          _state = 2;
          return false;
        }

        Current = _second.Current;
        return true;

      default:
        return false;
    }
  }

  public void Reset()
  {
    if (_first != null)
      _first.Reset();

    if (_second != null)
      _second.Reset();

    _state = 0;
  }
}

As you can see, the "yield return" keyword is doing a lot of work for us. But let's see some limitations of the yield return.

* It can't be used inside catch or finally blocks. The reason is that it is impossible to do a "goto" into a finally or catch clause. Since yield return is not a real .NET runtime feature but a compiler one, it must live with that limitation.
* If you want to put the yield return inside an inner method, you can't. The best you can do is to always call a method, like BeforeYield(value), and then yield return value;.
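
The first limitation can be worked around by moving the yield out of the catch clause. The following is only a minimal sketch: ParseAll and the -1 fallback value are hypothetical choices made for illustration. Because the compiler rejects a yield return inside the catch clause, the value is stored in a local variable and yielded after the try/catch has ended.

```csharp
using System;
using System.Collections.Generic;

public static class YieldLimitationSketch
{
  // Hypothetical helper: yields the parsed value of each text,
  // or -1 for texts that are not valid integers.
  public static IEnumerable<int> ParseAll(IEnumerable<string> texts)
  {
    foreach(string text in texts)
    {
      int result;

      try
      {
        result = int.Parse(text);
      }
      catch(FormatException)
      {
        // Writing "yield return -1;" here is a compile-time error,
        // so we only remember the value.
        result = -1;
      }

      // Outside the try/catch the yield is allowed.
      yield return result;
    }
  }
}
```

This keeps the method a normal iterator, at the cost of restructuring the code around the limitation.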

So, if I want a Combine method that only returns even numbers, I will need to write it like this:

public static IEnumerable<int> Combine(IEnumerable<int> first, IEnumerable<int> second)
{
  if (first != null)
    foreach(int value in first)
      if ((value % 2) == 0)
        yield return value;

  if (second != null)
    foreach(int value in second)
      if ((value % 2) == 0)
        yield return value;
}

or like this:

public static IEnumerable<int> Combine(IEnumerable<int> first, IEnumerable<int> second)
{
  if (first != null)
    foreach(int value in first)
      if (_IsEven(value))
        yield return value;

  if (second != null)
    foreach(int value in second)
      if (_IsEven(value))
        yield return value;
}
private static bool _IsEven(int value)
{
  return (value % 2) == 0;
}

If the test is not as simple as _IsEven, you will probably prefer the second approach. Another approach is to combine an EnumerateEvenNumbers method with the original Combine, so one enumerator combines the two sources and the other filters the even numbers. But, to me, the best solution would be like this:

public static IEnumerable<int> Combine(IEnumerable<int> first, IEnumerable<int> second)
{
  if (first != null)
    foreach(int value in first)
      _YieldIfEven(value);

  if (second != null)
    foreach(int value in second)
      _YieldIfEven(value);
}

// It is not an enumerator, but calls yield return as it is called from one.
private static void _YieldIfEven(int value)
{
  if ((value % 2) == 0)
    yield return value;
}

That's impossible to do at the moment, but I kept looking for a way to implement something like this.

Using threads and synchronization, I can already do something like this:

public static IEnumerable<int> Combine(IEnumerable<int> first, IEnumerable<int> second)
{
  return 
    new ThreadedEnumeratorFromAction<int>
    (
      (yieldReturn) =>
      {
        if (first != null)
          foreach(int value in first)
            _YieldIfEven(value, yieldReturn);

        if (second != null)
          foreach(int value in second)
            _YieldIfEven(value, yieldReturn);
      }
    );
}
private static void _YieldIfEven(int value, Action<int> yieldReturn)
{
  if ((value % 2) == 0)
    yieldReturn(value);
}

In fact, this approach works because there is another thread running this code. The first thread is simply kept waiting until this one yields a value or finishes.

The disadvantages? Many. There needs to be a thread and two AutoResetEvents to do the job. I already use my own thread pool and EventWaitHandle pool to try to minimize the overhead, but there is another problem. If the first thread holds a lock, the second thread can't acquire that lock, even though the first thread is only waiting for it; trying to do so causes a deadlock. If this solution ran on the same thread, the lock would already be acquired, so there would be no problem.
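
The lock problem can be seen in isolation. The sketch below (LockDemo and its method names are made up for illustration) uses Monitor.TryEnter, the non-blocking counterpart of the lock statement, to show that the thread holding a lock can enter it again, while a second thread cannot, even when the first thread is doing nothing but waiting for the second one.

```csharp
using System.Threading;

public static class LockDemo
{
  // The thread that already holds the lock can take it again: returns true.
  public static bool SameThreadCanReenter()
  {
    object gate = new object();

    lock(gate)
    {
      bool taken = Monitor.TryEnter(gate);
      if (taken)
        Monitor.Exit(gate);

      return taken;
    }
  }

  // A second thread can't take the lock while the first thread holds it,
  // even though the first thread is only waiting in Join: returns false.
  public static bool OtherThreadCanEnter()
  {
    object gate = new object();
    bool taken = true;

    lock(gate)
    {
      Thread worker =
        new Thread
        (
          () =>
          {
            taken = Monitor.TryEnter(gate);
            if (taken)
              Monitor.Exit(gate);
          }
        );

      worker.Start();
      worker.Join();
    }

    return taken;
  }
}
```

With a blocking lock statement instead of TryEnter, the second method would never return, which is exactly the deadlock described above.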

The advantages? Well, methods don't need a special compiler feature to work. If they can receive actions or inherit from that class, any method can "yield return" at any moment. If you pass "yieldReturn" as an action, you can pass an action that processes the value immediately, an action that posts the value to another thread or an action that actually yields the value, so your method can be much more versatile than when using the compiler-generated "yield return". Also, values can be yielded at any moment, including inside finally and catch blocks.
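
To make the mechanism concrete, here is a minimal sketch of the thread-based idea. It is not the ThreadedEnumeratorFromAction<T> class from my game: ThreadedEnumerableSketch<T>, ThreadedDemo and all member names are assumptions made for this example, it creates a plain thread instead of using a pool, and it deliberately ignores early disposal (if the consumer stops enumerating, the background thread stays blocked). The two AutoResetEvents make the two threads run strictly one at a time.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: runs the producer on another thread and
// hands its values over one at a time.
public sealed class ThreadedEnumerableSketch<T>:
  IEnumerable<T>
{
  private readonly Action<Action<T>> _producer;

  public ThreadedEnumerableSketch(Action<Action<T>> producer)
  {
    _producer = producer;
  }

  public IEnumerator<T> GetEnumerator()
  {
    var valueReady = new AutoResetEvent(false); // producer -> consumer
    var valueTaken = new AutoResetEvent(false); // consumer -> producer
    T current = default(T);
    bool finished = false;
    Exception error = null;

    var worker =
      new Thread
      (
        () =>
        {
          valueTaken.WaitOne(); // don't run before the first MoveNext
          try
          {
            _producer
            (
              value =>
              {
                current = value;
                valueReady.Set();     // wake up the consumer
                valueTaken.WaitOne(); // sleep until the next MoveNext
              }
            );
          }
          catch(Exception exception)
          {
            error = exception;
          }
          finally
          {
            finished = true;
            valueReady.Set();
          }
        }
      );
    worker.IsBackground = true;
    worker.Start();

    // The consumer side uses a compiler iterator only to keep the sketch short.
    while(true)
    {
      valueTaken.Set();
      valueReady.WaitOne();

      if (error != null)
        throw new InvalidOperationException("The producer failed.", error);

      if (finished)
        yield break;

      yield return current;
    }
  }
  IEnumerator IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}

public static class ThreadedDemo
{
  public static IEnumerable<int> CombineEven(IEnumerable<int> first, IEnumerable<int> second)
  {
    return
      new ThreadedEnumerableSketch<int>
      (
        (yieldReturn) =>
        {
          if (first != null)
            foreach(int value in first)
              _YieldIfEven(value, yieldReturn);

          if (second != null)
            foreach(int value in second)
              _YieldIfEven(value, yieldReturn);
        }
      );
  }
  private static void _YieldIfEven(int value, Action<int> yieldReturn)
  {
    if ((value % 2) == 0)
      yieldReturn(value);
  }
}
```

Note that _YieldIfEven is an ordinary method: it only receives the action, so the same method could be given an action that adds to a list or posts to another thread instead.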

The Theory of a StackSaver (or StackStateMachine)

The stack already has all the values "stored", but it is sequential. We can't copy part of the stack to another object and go back some levels in the stack. But, at least when I programmed in 680x0 assembler, it was possible to manipulate the stack directly.

So, my idea is something like this. There will be a class (I called it StackSaver, but maybe StackStateMachine is better). It is initialized with an action and is enumerable. When you call MoveNext, it stores its current stack position and runs until its YieldReturn method is called.

When that happens, it copies the Stack from its activation until this moment to its own store, goes back in the stack and returns true. If the method ends, it returns false.

So, considering it receives a simple Action (not Action<object> or similar) the code could look like this:

private static IEnumerable<int> Combine(IEnumerable<int> first, IEnumerable<int> second)
{
  return
    new StackSaver
    (
      () =>
      {
        if (first != null)
          foreach(int value in first)
            _YieldIfEven(value);

        if (second != null)
          foreach(int value in second)
            _YieldIfEven(value);
      }
    );
}
private static void _YieldIfEven(int value)
{
  if ((value % 2) == 0)
    StackSaver.StaticYieldReturn(value); // this method will return to the last active StackSaver.
}

Even if this initial code looks worse than the "yield return" version, there are many advantages:

* The compiler does not need to support the auto-generation of the yield return state machine for this to work. Supporting Action is enough.
* I used the static version of YieldReturn, but an instance version can also exist. If you receive it as a delegate, your method can still process the results directly (without yielding), and it becomes possible to yield to the first StackSaver when one is used inside another.
* If they really implement this, I suggest they also create an IIterator interface for iterators that don't need to return a value. I will surely use it in my game.
* If there is a lock held by the caller, the action running in the StackSaver will also have it, so there will be no deadlocks caused by that.

Dispose or Collection

Some may argue that if we create such a stack saver and forget to use it, we will create problems.

Well, I don't think so. When Dispose is called, something like the ThreadAbortException can be thrown. It is a catchable exception that is always rethrown, so it forces finally and catch blocks to execute, but nothing else. This can happen at Dispose or, if it was not disposed, when the thread dies.
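
For comparison, the compiler-generated iterators already give a similar guarantee for finally blocks, without needing an exception: disposing an enumerator that is suspended inside a try block runs the pending finally. A StackSaver would need some equivalent. The sketch below only demonstrates the existing behavior (DisposeDemo, Numbers, Run and Log are names invented for the example).

```csharp
using System.Collections.Generic;

public static class DisposeDemo
{
  public static readonly List<string> Log = new List<string>();

  public static IEnumerable<int> Numbers()
  {
    try
    {
      yield return 1;
      yield return 2;
    }
    finally
    {
      Log.Add("finally");
    }
  }

  public static void Run()
  {
    using(IEnumerator<int> enumerator = Numbers().GetEnumerator())
    {
      // After this call the iterator is suspended at "yield return 1",
      // inside the try block.
      enumerator.MoveNext();
    }
    // Leaving the using block calls Dispose, which runs the finally block
    // even though the enumeration was never completed.
  }
}
```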

Is it easy to implement?

I really don't know how Microsoft made the call stack of .NET work, but I think it is easier to implement than the yield return keyword was. After all, the stack is already managed, so how difficult would it be to copy it to an object and then return some levels?

If you agree with me that this could be a very useful feature, go to http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2107187-create-a-stacksaver-class-to-facilitate-state-mana and vote for it.

Source Code?

Well, the only thing I could really present is the ThreadedEnumeratorFromAction<T> class, but at this moment I only have the non-returning version of it that I am starting to use in my game. I want to release an update of the game soon that will be using it. So, be patient.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Paulo Zemek

Software Developer (Senior) Microsoft

United States

I started to program computers when I was 11 years old, as a hobbyist, programming in AMOS Basic and Blitz Basic for Amiga.
At 12 I had my first try with assembler, but it was too difficult at the time. Then, in the same year, I learned C and, after learning C, I was finally able to learn assembler (for Motorola 680x0).
Not sure, but probably between 12 and 13, I started to learn C++. I always programmed "in an object oriented way", but using function pointers instead of virtual methods.

At 15 I started to learn Pascal at school and to use Delphi. At 16 I started my first internship (using Delphi). At 18 I started to work professionally using C++ and since then I've developed my programming skills as a professional developer in C++ and C#, generally creating libraries that help other developers do their work easier, faster and with less errors.

Want more info or simply want to contact me?
Take a look at: http://paulozemek.azurewebsites.net/
Or e-mail me at: paulozemek@outlook.com

Codeproject MVP 2012, 2015 & 2016
Microsoft MVP 2013-2014 (in October 2014 I started working at Microsoft, so I can't be a Microsoft MVP anymore).
