
Async Await and the Generated StateMachine

By Amit Bezalel, 28 Jan 2013
 

Introduction

Like everybody else, I went digging into .NET 4.5 to see what's new, but I didn't find any good step-by-step explanation of how the async state machine actually works. So I used ILSpy, and along with a good video from the creator, I got a good notion of what is what (and you are welcome to have a peek too).

The State Machine

I read somewhere that you can think of it like this:

  • Call "before" code
  • Call await code as task
  • Store "after" code in a continuation task

But this isn’t really the case.
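For reference, that simplified mental model corresponds roughly to chaining a continuation by hand. The sketch below is only an illustration of the model, not of what the compiler emits; NaiveModel, FetchAsync and GetLengthNaive are hypothetical names invented here.

```csharp
using System;
using System.Threading.Tasks;

public static class NaiveModel
{
    // Hypothetical stand-in for an awaited operation (not part of the article's code).
    public static Task<string> FetchAsync(string url)
    {
        return Task.Run(() => "<html>" + url + "</html>");
    }

    // The "before / awaited task / after as continuation" picture, written by hand.
    public static Task<int> GetLengthNaive(string url)
    {
        Console.WriteLine("before await");        // "before" code
        Task<string> awaited = FetchAsync(url);   // the await code, as a task
        return awaited.ContinueWith(t =>          // "after" code, stored in a continuation
        {
            Console.WriteLine("after await");
            return t.Result.Length;
        });
    }

    public static void Main()
    {
        // Blocking on Result is acceptable here only because this is a console demo.
        int length = GetLengthNaive("http://example.com").Result;
        Console.WriteLine(length);
    }
}
```

This version allocates a delegate and a continuation task for every await, pays no attention to the synchronization context, and never takes a shortcut when the awaited task is already complete, which is where the generated code differs.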

What really happens is actually something like this:

  • At compile time:
    • A struct implementing IAsyncStateMachine (the state machine) is generated
    • It contains fields that save the function's local state
    • A MoveNext function is created, which holds the entire code of the original function
    • The code is split at each await call into several cases (machine states)
    • Calling code that creates and initializes this machine replaces the body of our async function.
  • At runtime:
    • A task is created to represent the function's result, and the machine starts running on the calling thread:
    • Local variables are “lifted” into the state machine as fields.
    • Code is run until the first await
    • The awaited function is called and returns its task
    • Machine state is set to the next state, so the next code fragment will run on wake-up
    • A wake-up call (a continuation) is scheduled for when that task completes
    • The MoveNext function returns (the thread is released to do other stuff, such as updating the UI)
  • When the wake-up call is issued (for example, when the OS signals that the I/O has completed):
    • MoveNext is called again on the thread which handles the await continuation
    • The SynchronizationContext captured at the await (SynchronizationContext.Current) is used to pick the correct thread to run it on.
      • This behavior can be changed by using: await task.ConfigureAwait(false);
    • The next code segment is run, since the next state was set before yielding control
    • Another await is scheduled, [etc.]
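The role of the synchronization context in the last part of the list can be observed with a small experiment. This is a minimal sketch, assuming a console program; CountingContext, ContextDemo and CountPosts are hypothetical names made up for the illustration. The custom context only counts how many continuations are posted to it and then hands them to the thread pool, as the base class does.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context: counts posted continuations, then uses the default behavior.
public sealed class CountingContext : SynchronizationContext
{
    private int _posts;
    public int Posts { get { return Volatile.Read(ref _posts); } }

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref _posts);
        base.Post(d, state); // the base class queues the callback to the thread pool
    }
}

public static class ContextDemo
{
    // The continuation is posted to the context that was current at the await.
    public static async Task CapturedAsync()
    {
        await Task.Delay(50);
    }

    // ConfigureAwait(false) asks the awaiter not to use the captured context.
    public static async Task NotCapturedAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
    }

    // Runs the method under a fresh CountingContext and reports the number of posts.
    public static int CountPosts(Func<Task> asyncMethod)
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            // Blocking is safe here because the context never posts back to this thread.
            asyncMethod().GetAwaiter().GetResult();
            return context.Posts;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static void Main()
    {
        Console.WriteLine("posts with captured context: " + CountPosts(CapturedAsync));
        Console.WriteLine("posts with ConfigureAwait(false): " + CountPosts(NotCapturedAsync));
    }
}
```

In a real GUI application the context would marshal the continuation to the UI thread instead of the pool, so blocking the UI thread the way CountPosts does would risk a deadlock there.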

See it written in the code

Two async functions decoded

// AsyncAwait.Engine.Downloader
public static async Task<string> DownloadHtmlAsyncTask(string url)
{
    HttpClient httpClient = new HttpClient();
    Debug.WriteLine("before await");
    string result = await httpClient.GetStringAsync(url);
    Debug.WriteLine("after await");
    return result;
}

// AsyncAwait.ViewModel.MainWndViewModel
private async Task<string> DownloadWithUrlTrackingTaskAsync(string url)
{
    Debug.WriteLine("before await1");
    string Data = await DownloadHtmlAsyncTask(url);
    Debug.WriteLine("before await2");
    string Data1 = await DownloadHtmlAsyncTask(url);
    Debug.WriteLine("the end.");
    return Data;
}

The decompilation was done with ILSpy, with the "decompile async methods" option turned off.

First function

DownloadHtmlAsyncTask uses only one await call.

This is the calling code which initializes the state machine:

[DebuggerStepThrough, AsyncStateMachine(typeof(AsyncMethods.<DownloadHtmlAsyncTask>d__0))]
public static Task<string> DownloadHtmlAsyncTask(string url)
{
    AsyncMethods.<DownloadHtmlAsyncTask>d__0 <DownloadHtmlAsyncTask>d__;
    <DownloadHtmlAsyncTask>d__.url = url;
    <DownloadHtmlAsyncTask>d__.<>t__builder = AsyncTaskMethodBuilder<string>.Create();

    //set initial machine state to -1
    <DownloadHtmlAsyncTask>d__.<>1__state = -1;
    AsyncTaskMethodBuilder<string> <>t__builder = <DownloadHtmlAsyncTask>d__.<>t__builder;
    <>t__builder.Start<AsyncMethods.<DownloadHtmlAsyncTask>d__0>(ref <DownloadHtmlAsyncTask>d__);
    return <DownloadHtmlAsyncTask>d__.<>t__builder.get_Task();

}

The state machine definition: (my comments inline)

using ConsoleApplication1;
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
[CompilerGenerated]
[StructLayout(LayoutKind.Auto)]
private struct <DownloadHtmlAsyncTask>d__0 : IAsyncStateMachine
{
    //initial state is set to -1 by the machine's creation code
    public int <>1__state;
    public AsyncTaskMethodBuilder<string> <>t__builder;
    public string url;
    public HttpClient <httpClient>5__1;
    public string <result>5__2;
    private TaskAwaiter<string> <>u__$awaiter3;
    private object <>t__stack;
    void IAsyncStateMachine.MoveNext()
    {
        string result;
        try
        {
            int num = this.<>1__state;

            //if (!Stop)
            if (num != -3)
            {
                TaskAwaiter<string> taskAwaiter;
                //machine starts with num=-1 so we enter
                if (num != 0)
                {
                    //first (+ initial) state code, run code before await is invoked
                    this.<httpClient>5__1 = new HttpClient();
                    Debug.WriteLine("before await");
                    
                    //a task is invoked
                    taskAwaiter = this.<httpClient>5__1.GetStringAsync(this.url).GetAwaiter();
                    
                    //[performance] check if this task has completed already,
                    //if it did, skip scheduling and boxing
                    if (!taskAwaiter.get_IsCompleted())
                    {
                        this.<>1__state = 0;
                        this.<>u__$awaiter3 = taskAwaiter;
                        
                        //Schedules the state machine to proceed
                        //to the next action when the specified awaiter completes.
                        //Also: sending this state machine here will trigger its boxing onto the heap.
                        this.<>t__builder.AwaitUnsafeOnCompleted<TaskAwaiter<string>, 
                          AsyncMethods.<DownloadHtmlAsyncTask>d__0>(ref taskAwaiter, ref this);
                        
                        //release cpu knowing our next step will be called by Framework.
                        return;
                    }
                }
                else
                {
                    //resuming (state = 0): restore the saved awaiter and clear the field
                    taskAwaiter = this.<>u__$awaiter3;
                    this.<>u__$awaiter3 = default(TaskAwaiter<string>);
                    
                    //set state to initial state (temporarily)
                    this.<>1__state = -1;
                }
                
                //second (+ final) state code (state=0): set result to member, printout
                string arg_A5_0 = taskAwaiter.GetResult();
                
                //reset the local awaiter to its default value
                taskAwaiter = default(TaskAwaiter<string>);
                
                //set StateMachine's result field, and end code (print out)
                string text = arg_A5_0;
                this.<result>5__2 = text;
                Debug.WriteLine("after await");
                
                //set return task's result
                result = this.<result>5__2;
            }
        }
        //exception handling is done here
        catch (Exception exception)
        {
            //set machine state to final state
            this.<>1__state = -2;
            this.<>t__builder.SetException(exception);
            return;
        }
        this.<>1__state = -2;
        this.<>t__builder.SetResult(result);
    }
    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
    {
        this.<>t__builder.SetStateMachine(param0);
    }
}

  • The state machine is created as a struct, which means it starts out on the stack.
  • It is moved to the heap inside the AwaitUnsafeOnCompleted function (deeper into the mechanism it is cast to IAsyncStateMachine, which triggers the boxing).
  • This call may be skipped if the task we await has already ended by the time taskAwaiter.IsCompleted is checked.
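The skipped-call case is easy to observe from the outside. A minimal sketch, with FastPathDemo, FromCacheAsync and FromPendingAsync as hypothetical names: when the awaited task is already complete, the async method runs to the end synchronously and returns a finished task; when it is still pending, the method suspends and the returned task stays incomplete until the awaited one finishes.

```csharp
using System;
using System.Threading.Tasks;

public static class FastPathDemo
{
    // The awaited task is already complete, so IsCompleted is true at the await
    // and the state machine never has to suspend.
    public static async Task<string> FromCacheAsync()
    {
        string value = await Task.FromResult("cached");
        return value.ToUpperInvariant();
    }

    // The awaited task is still pending, so the method suspends at the await.
    public static async Task<string> FromPendingAsync(Task<string> pending)
    {
        string value = await pending;
        return value.ToUpperInvariant();
    }

    public static void Main()
    {
        Task<string> fast = FromCacheAsync();
        Console.WriteLine("fast path completed on return: " + fast.IsCompleted);

        var source = new TaskCompletionSource<string>();
        Task<string> slow = FromPendingAsync(source.Task);
        Console.WriteLine("pending path completed on return: " + slow.IsCompleted);

        source.SetResult("late");          // wakes the suspended method up
        Console.WriteLine(slow.Result);
    }
}
```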

This function has only two code fragments (two states), since there is only one await dividing the code. It may be a standard scenario, but it is not an interesting one.
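Because the compiler's mangled names are not valid C#, the decompiled listing cannot be pasted back into a project. The same two-state shape can, however, be written by hand against the public AsyncTaskMethodBuilder<string> type. This is a sketch under assumptions, not the compiler's output: DownloadMachine, HandWritten and the Fetch delegate (a stand-in for HttpClient.GetStringAsync) are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public struct DownloadMachine : IAsyncStateMachine
{
    public int State;                         // -1 = running, 0 = suspended at the await, -2 = finished
    public AsyncTaskMethodBuilder<string> Builder;
    public Func<string, Task<string>> Fetch;  // hypothetical stand-in for the awaited call
    public string Url;
    private TaskAwaiter<string> _awaiter;

    public void MoveNext()
    {
        string result;
        try
        {
            TaskAwaiter<string> awaiter;
            if (State != 0)
            {
                // First fragment: start the awaited operation.
                awaiter = Fetch(Url).GetAwaiter();
                if (!awaiter.IsCompleted)
                {
                    State = 0;
                    _awaiter = awaiter;
                    // Registers MoveNext as the continuation (and boxes this struct).
                    Builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                    return;
                }
            }
            else
            {
                // Woken up: restore the awaiter saved before suspending.
                awaiter = _awaiter;
                _awaiter = default(TaskAwaiter<string>);
                State = -1;
            }
            // Second fragment: everything after the await.
            result = awaiter.GetResult();
        }
        catch (Exception ex)
        {
            State = -2;
            Builder.SetException(ex);
            return;
        }
        State = -2;
        Builder.SetResult(result);
    }

    public void SetStateMachine(IAsyncStateMachine stateMachine)
    {
        Builder.SetStateMachine(stateMachine);
    }
}

public static class HandWritten
{
    // Plays the role of the generated "calling code".
    public static Task<string> DownloadAsync(string url, Func<string, Task<string>> fetch)
    {
        var machine = new DownloadMachine();
        machine.Url = url;
        machine.Fetch = fetch;
        machine.Builder = AsyncTaskMethodBuilder<string>.Create();
        machine.State = -1;
        machine.Builder.Start(ref machine);
        return machine.Builder.Task;
    }

    public static void Main()
    {
        Task<string> page = DownloadAsync("http://example.com",
            url => Task.Delay(10).ContinueWith(_ => "<html>" + url + "</html>"));
        Console.WriteLine(page.Result);
    }
}
```

Builder has to stay a field rather than a property here, since Start and AwaitUnsafeOnCompleted must operate on the builder stored inside the machine and not on a copy of it.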

What happens when the state machine has more states? This brings us to the second example...

Second function

DownloadWithUrlTrackingTaskAsync - using two await calls:

The calling code:

[DebuggerStepThrough, AsyncStateMachine(typeof(AsyncMethods.<DownloadWithUrlTrackingTaskAsync>d__5))]
private Task<string> DownloadWithUrlTrackingTaskAsync(string url)
{
    AsyncMethods.<DownloadWithUrlTrackingTaskAsync>d__5 <DownloadWithUrlTrackingTaskAsync>d__;
    <DownloadWithUrlTrackingTaskAsync>d__.<>4__this = this;
    <DownloadWithUrlTrackingTaskAsync>d__.url = url;
    <DownloadWithUrlTrackingTaskAsync>d__.<>t__builder = AsyncTaskMethodBuilder<string>.Create();
    
    //set initial machine state to -1
    <DownloadWithUrlTrackingTaskAsync>d__.<>1__state = -1;
    AsyncTaskMethodBuilder<string> <>t__builder = 
      <DownloadWithUrlTrackingTaskAsync>d__.<>t__builder;
    <>t__builder.Start<AsyncMethods.<DownloadWithUrlTrackingTaskAsync>d__5>(ref <DownloadWithUrlTrackingTaskAsync>d__);
    return <DownloadWithUrlTrackingTaskAsync>d__.<>t__builder.get_Task();
}

The state machine definition: (read the comments in the code)

using ConsoleApplication1;
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
[CompilerGenerated]
[StructLayout(LayoutKind.Auto)]
private struct <DownloadWithUrlTrackingTaskAsync>d__5 : IAsyncStateMachine
{
    public int <>1__state;
    public AsyncTaskMethodBuilder<string> <>t__builder;
    public AsyncMethods <>4__this;
    public string url;
    public string <Data>5__6;
    public string <Data1>5__7;
    private TaskAwaiter<string> <>u__$awaiter8;
    private object <>t__stack;
    void IAsyncStateMachine.MoveNext()
    {
        string result;
        try
        {
            TaskAwaiter<string> taskAwaiter;
            //initially state = -1, so no case matches and we fall through to the first state code
            switch (this.<>1__state)
            {
            case -3:
                //request machine to stop!
                //machine will go to end state and set result if one exists.
                goto IL_168;
            case 0:
                taskAwaiter = this.<>u__$awaiter8;
                this.<>u__$awaiter8 = default(TaskAwaiter<string>);
                //set state to initial state (temporarily)
                this.<>1__state = -1;
                goto IL_A1;
            case 1:
                taskAwaiter = this.<>u__$awaiter8;
                this.<>u__$awaiter8 = default(TaskAwaiter<string>);
                //set state to initial state (temporarily)
                this.<>1__state = -1;
                goto IL_121;
            }
            
            //first state code (state = -1):
            //printout, and await, then return control to scheduler
            Debug.WriteLine("before await1");
            taskAwaiter = AsyncMethods.DownloadHtmlAsyncTask(this.url).GetAwaiter();
            if (!taskAwaiter.get_IsCompleted())
            {
                //set state to next step (0)
                this.<>1__state = 0;
                this.<>u__$awaiter8 = taskAwaiter;
                this.<>t__builder.AwaitUnsafeOnCompleted<TaskAwaiter<string>, 
                  AsyncMethods.<DownloadWithUrlTrackingTaskAsync>d__5>(ref taskAwaiter, ref this);
                return;
            }
            IL_A1:
            //second state code (state = 0): set the result of the first call to member,
            //printout, schedule next await, yield control
            string arg_B0_0 = taskAwaiter.GetResult();
            taskAwaiter = default(TaskAwaiter<string>);
            string text = arg_B0_0;
            this.<Data>5__6 = text;
            Debug.WriteLine("before await2");
            taskAwaiter = AsyncMethods.DownloadHtmlAsyncTask(this.url).GetAwaiter();
            if (!taskAwaiter.get_IsCompleted())
            {
                //set state to next step (1)
                this.<>1__state = 1;
                this.<>u__$awaiter8 = taskAwaiter;
                this.<>t__builder.AwaitUnsafeOnCompleted<TaskAwaiter<string>, 
                  AsyncMethods.<DownloadWithUrlTrackingTaskAsync>d__5>(ref taskAwaiter, ref this);
                return;
            }
            IL_121:
            //third state code (state = 1):
            //set the result of the second call to member, printout,
            //set the (function's) return task's result.
            string arg_130_0 = taskAwaiter.GetResult();
            taskAwaiter = default(TaskAwaiter<string>);
            text = arg_130_0;
            this.<Data1>5__7 = text;
            Debug.WriteLine("the end.");
            result = this.<Data>5__6;
        }
        catch (Exception exception)
        {
            //some exception handling: set end state (-2)
            this.<>1__state = -2;
            //set the exception in the builder
            this.<>t__builder.SetException(exception);
            return;
        }
        IL_168:
        //if no exception set end state and result
        this.<>1__state = -2;
        this.<>t__builder.SetResult(result);
    }
    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
    {
        this.<>t__builder.SetStateMachine(param0);
    }
}

Points of Interest

  • The overhead:
    • Anyone can see that there are "several" extra code lines added.
    • But according to Microsoft:
      • About 40 operations are used to create async state machine (microseconds).
      • This overhead is similar to or less than that of the older async mechanisms.
      • This can be a problem only when calling many awaits in a tight loop.
    • Besides that, with today's CPUs who will notice the overhead when it's run on a non-GUI thread? Or even better - on the HDD controller...
  • Exception handling
    • The task returned by Task.WhenAll collects the exceptions from all of the tasks it aggregates.
    • When that task is awaited, only the first of those exceptions is rethrown.
    • The task.Exception property will contain all the exceptions inside an AggregateException (you can also use Task.Wait() instead of await, which throws the AggregateException itself)
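The difference between what await rethrows and what the task stores can be checked with a short experiment. A minimal sketch; WhenAllDemo, FailAsync and ObserveAsync are hypothetical names made up for the illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical helper: an async method that fails with the given exception.
    public static async Task FailAsync(Exception error)
    {
        await Task.Yield();
        throw error;
    }

    // Returns the type name of the exception that await rethrew, together with
    // the number of exceptions stored on the aggregated task.
    public static async Task<Tuple<string, int>> ObserveAsync()
    {
        Task all = Task.WhenAll(
            FailAsync(new InvalidOperationException("first")),
            FailAsync(new ArgumentException("second")));
        try
        {
            await all; // rethrows only the first stored exception
            return Tuple.Create("none", 0);
        }
        catch (Exception ex)
        {
            // all.Exception is an AggregateException holding every failure.
            return Tuple.Create(ex.GetType().Name, all.Exception.InnerExceptions.Count);
        }
    }

    public static void Main()
    {
        Tuple<string, int> observed = ObserveAsync().Result;
        Console.WriteLine("await rethrew: " + observed.Item1);
        Console.WriteLine("exceptions stored on the task: " + observed.Item2);
    }
}
```

Keeping a reference to the WhenAll task, as ObserveAsync does, is what makes the remaining exceptions reachable after the await has thrown.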

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Amit Bezalel
Software Developer (Senior), HP Software
Israel
I've been all over the coding world since earning my degrees. I have worked in C++ and Java, finally settling into C# about 6 years ago, where I have spent a good amount of my time on performance tweaking and memory debugging, as well as designing new solutions and hacking at old ones to stay in line.
 
Computers never cease to amaze me, and I'm glad to have found a field where I get paid to do what I enjoy.
 
I have been toying with the idea of publishing stuff online for years without ever getting around to it, so I still have a lot of stuff to write up, aside from all the other new stuff I'll get excited about. I hope you'll enjoy reading it as much as I enjoy writing.
 


Comments and Discussions

Question: Overhead (member Dewey, 28 Jan '13 - 14:36)
Your statement - "Besides that, with today's CPUs who will notice the overhead when it's run on a non-GUI thread?"
 
If you're writing client code, this is absolutely true.
 
If it's server code where hundreds of users could be hitting it often, I begin to wonder, but I don't have anything solid to back up my feeling.
 
The saving grace may be the impact on the thread pool, where I've heard that async/await is the superior technology.
Answer: Re: Overhead (member Amit Bezalel, 28 Jan '13 - 21:17)
This whole mechanism is geared towards client code (the synchronization context bit is a dead giveaway).
Also, when coding on the server side, you usually want the threads you are running in to be created and managed by the server and not by your own mechanisms. This is because threads created by TPL calls will not have direct access to HttpContext or the current WCF object, making security and correct user logging a challenge.
Even if you choose to use async await on the server side, and you don't mind the non-server thread issues, the small overhead should be weighed against the time-saving benefits of running in parallel (...and the programmer-time-saving benefits of clear, manageable code ;) ).
General: Re: Overhead (member Dewey, 1 Feb '13 - 12:43)
Wow, I'm shocked at how little you know vs how much you're willing to talk!
 
You basically seem to know nothing about server programming. Having access to HttpContext is a non-issue, and the whole point of using Async/Await is to NOT use server threads, so your first response is a bit strange.
 
Async/Await is actually TPL, and we use it on the server all the time. You might want to review how to use an enterprise service bus, or message passing, to see how client communication is typically handled.
 
Having said that, my question was a simple one that applies to both client and server, the overhead won't change, just the speed of the processor(s) in some cases.
 
BTW, synchronization context wasn't the giveaway, I sort of got that you were talking client side when you said GUI thread.
General: Re: Overhead (member Amit Bezalel, 3 Feb '13 - 19:45)
Wow, sticks and stones.. really, how old are you?
 
The HttpContext problem was an issue for me when working with WebServices on IIS, and later when migrating to WCF. So even though you might not have an issue with it on your project, that doesn't mean nobody has a problem with it.
 
When coding pure infrastructure logic (like message passing), there is no problem, but when trying to do parallel work with BusinessLogic flows, you run into some difficulty, especially if you have some legacy code to work with.
You do have AspNetSynchronizationContext, but when using it, you don't really get a parallel behavior, since it makes sure tasks will run one at a time, each one capturing the context in turn, as you can see in Stephen Cleary's article:
"If multiple operations complete at once for the same application, AspNetSynchronizationContext will ensure that they execute one at a time."
 
While async await uses TPL, it is not "actually TPL"; it is a new API with its own default behaviors, for example the different exception handling (only the first exception is rethrown), and the fact that the use of the current synchronization context (SynchronizationContext.Current) is implicit, instead of explicit as in TPL.
 
I am still convinced that the default behavior of this api is geared towards GUI applications (specifically the new Win8 metro GUI), but that doesn't mean that you can't use it to code server-side flows, just that you should know your stuff before you actually do it.
Question: Message Loop? (member Thanasis I., 28 Jan '13 - 12:03)
You mentioned at some point on runtime execution,
A wake-up event is scheduled
Then later
When wakeup call is issued by OS:
Thread which handles the await continuation is called
 
If i understood correctly: Main Thread creates and runs a new async task (in another thread, for instance Thread2) and moves on. Later when Thread2 terminates, a wake-up event is fired in the Main Thread and the framework executes the code that corresponds to the continuation block.
 
So, the Main Thread must have a message loop, to catch the "wake-up" event, right?
 
What happens in Console applications, that do not have a message loop?
 
Or is it some other other construct of the .NET framework that handles this event?
 
-- Or is it that I did not understand correctly in the first place??
Answer: Re: Message Loop? (member Amit Bezalel, 28 Jan '13 - 20:20)
This is taken from MSDN:
If a synchronization context (SynchronizationContext object) is associated with the thread that was executing the asynchronous method at the time of suspension (for example, if the SynchronizationContext.Current property is not null), the asynchronous method resumes on that same synchronization context by using the context’s Post method. Otherwise, it relies on the task scheduler (TaskScheduler object) that was current at the time of suspension. Typically, this is the default task scheduler (TaskScheduler.Default), which targets the thread pool. This task scheduler determines whether the awaited asynchronous operation should resume where it completed or whether the resumption should be scheduled. The default scheduler typically allows the continuation to run on the thread that the awaited operation completed.
 
What actually happens is:
* The code runs synchronously on the main thread until await
* The awaited operation is started, and runs on another thread (or uses I/O completion if relevant)
* When it's time to run the "after" block,
  1. If a SynchronizationContext exists for the original calling thread, it is used (which brings us back to the GUI thread)
  2. If no sync context is available, the default scheduler is used, which takes a thread from the pool. (This will happen when no message pump exists, like in console apps.)
See here: Bnaya Eshet's "the concept of async" blog
General: Re: Message Loop? (member Thanasis I., 29 Jan '13 - 12:28)
Thank you!
 
So if I understand correctly, for GUI apps it is 100% certain that the "after" block will execute on the UI thread (provided the await itself was reached on the UI thread), so there is no need to check "InvokeRequired", and
 
for console apps the "after" block will possibly execute on a different thread.
General: Re: Message Loop? (member Amit Bezalel, 29 Jan '13 - 20:43)
Since the main idea is keeping the code flow as simple as a synchronous function, removing all those invokes and checks from the code is part of the solution.
 
If you are in a GUI application, but writing a non-GUI operation, and you don't want it to jump back into the GUI thread via the sync context, you can use task.ConfigureAwait(false) on the task returned from the async function, so that the continuation runs on a thread taken from the pool (thus getting console app behavior inside a GUI app).
General: Re: Message Loop? (member Dewey, 1 Feb '13 - 12:49)
You could better understand this if you had programmed Windows in C/C++.
 
There is a message loop, and the GUI is "Painted" in the WM_PAINT message!
 
Microsoft created these other mechanisms to hide that fact and deliver a different programming model to WinForms, etc.
 
Underneath, it's still a message loop with an HWND.
Question: A small error. (member Paulo Zemek, 28 Jan '13 - 4:33)
You wrote this: It is moved to heap when passed by ref to the AwaitUnsafeOnCompleted function.
 
In fact, when it is passed as ref, that means it is not moved. A *pointer* to its actual memory address is sent, keeping it where it already is (in this case, the stack).
 
Maybe there is a copy somewhere, but using the ref keyword is exactly the opposite to moving it to the heap, it wants to use the struct where it is at the moment.

Last Updated 28 Jan 2013
Article Copyright 2013 by Amit Bezalel