01 July 2013 by Chris Hurley

The support for asynchronous operations in .NET 4.5 has made it much easier to create easily-intelligible asynchronous methods that avoid blocking. However, async/await isn't cost-free in terms of CPU overhead. How best to judge when to use it? Chris Hurley explains.

Async/await is great for avoiding blocking while potentially time-consuming work is performed in a .NET application, but there are overheads associated with running an async method: the current execution context has to be captured, there is a thread transition, and a state machine is built through which your code runs. The cost of this is comparatively negligible when the asynchronous work takes a long time, but it’s worth keeping in mind.

Support for the async and await contextual keywords is one of the most convenient new features in .NET 4.5. It’s always been possible to write asynchronous code, of course, but async/await allows it to be written in a relatively straightforward manner that neatly expresses the intention of the code, and means that it isn’t necessary to write separate continuation methods. As long as you have a Task (or anything else that implements the awaitable pattern) to await, the compiler automatically generates the code needed to suspend the method until the work completes and then resume execution afterwards, all without blocking the calling thread unnecessarily.
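To make the contrast with separate continuation methods concrete, here is a minimal sketch of the same operation written both ways. The names ComputeAsync, WithContinuation and WithAwait are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationVersusAwait
{
    // Hypothetical stand-in for some potentially slow work.
    private static Task<int> ComputeAsync()
    {
        return Task.Run(() => 21);
    }

    // Pre-async/await style: the follow-up logic lives in a separate continuation.
    public static Task<int> WithContinuation()
    {
        return ComputeAsync().ContinueWith(t => t.Result * 2);
    }

    // async/await style: the follow-up logic simply follows the await.
    public static async Task<int> WithAwait()
    {
        int value = await ComputeAsync();
        return value * 2;
    }
}
```

Both methods return a Task<int> that eventually yields the same value; the second simply reads in the order in which the work happens.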

To provide a responsive and smooth interface, particularly on touch and gesture devices, it is important to avoid blocking the UI thread. This was a central focus for Microsoft during the development of the WinRT API, and they ensured that any APIs that may take longer than 50ms to execute would only be available in an asynchronous form.

Of course, you could use async/await regardless of the amount of time that the method call is likely to take. However, the ease with which it’s possible to make an operation asynchronous in your code hides the work that’s being done behind the scenes. As soon as the compiler sees the async keyword next to a method, it replaces your method with the async state machine. If you write a simple method that looks like:

private async void AsyncMethod()
{
    InitialWork();
    await Task.Run(() => DoAsyncWork());
    FinalWork();
}

… then the compiler generates the following (obtained by setting .NET Reflector to .NET 4.0 mode, so it doesn’t attempt to understand the async implementation):

private void AsyncMethod()
{
    <AsyncMethod>d__16 d__;
    d__.<>4__this = this;
    d__.<>t__builder = AsyncVoidMethodBuilder.Create();
    d__.<>1__state = -1;
    d__.<>t__builder.Start<<AsyncMethod>d__16>(ref d__);
}

Calling the method now requires creating a state machine and building a Task to contain the work that goes on within it: none of the code in the original method is referenced here. Setting it all up the first time is a relatively complex operation (Figure 1):

Figure 1: Framework methods required to initialize an example async method

Despite this async method being relatively simple, ANTS Performance Profiler shows that it’s caused over 900 framework methods to be run in order to initialize it and the work it does the first time that it’s run.

The largest proportion of these methods is made up of those involved in starting a new Task in which to do the asynchronous work, due to the call to Task.Run (Figure 2). This is not inherently due to the use of async/await, but it should be noted that moving the asynchronous work onto another thread in some way like this is required if the original thread is to be unblocked: otherwise, the work is done synchronously, despite the use of the async/await keywords. Even if the method never hits an await statement or starts a new Task, there is still overhead, as building the async method involves getting the execution context and synchronization context and therefore examining the stack. Fortunately, the context is cached, and so the overhead on subsequent calls is much lower.

Figure 2: The framework methods include those called by Task.Run and several System.Runtime.CompilerServices methods
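The point that the async keyword alone does not move work off the calling thread can be checked with a small sketch. The class and method names here are hypothetical, and the example assumes it runs without a synchronization context (for instance, in a console application).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SynchronousAsyncDemo
{
    // Marked async, but nothing moves the work off the calling thread:
    // awaiting an already-completed task continues synchronously.
    public static async Task<int> NoOffloadAsync()
    {
        await Task.FromResult(0);
        return Thread.CurrentThread.ManagedThreadId;
    }

    // Task.Run queues the delegate to the thread pool, so the reported
    // thread ID comes from whichever thread ran the delegate.
    public static async Task<int> OffloadAsync()
    {
        return await Task.Run(() => Thread.CurrentThread.ManagedThreadId);
    }

    // True when NoOffloadAsync finished before returning to the caller
    // and did all of its work on the caller's own thread.
    public static bool RanOnCallingThread()
    {
        int caller = Thread.CurrentThread.ManagedThreadId;
        Task<int> task = NoOffloadAsync();
        return task.IsCompleted && task.Result == caller;
    }
}
```

Here NoOffloadAsync still pays for the state machine, but the caller gains nothing in responsiveness because the method has already completed by the time it returns.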

The synchronization context is necessary to ensure that the continuation code after await statements is called in the same context as the original code. This is important if, for example, the method was originally called from the UI thread and will update the UI when the asynchronous task is complete, but is not always necessary or desirable. Calling ConfigureAwait(false) on the awaited Task means the continuation is not marshalled back to the captured synchronization context, and should be used when there is no need to return to the original context.
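As a minimal sketch of how this looks in code, assume a hypothetical library method, CountWordsAsync, that never touches the UI and therefore has no reason to resume on the caller's context:

```csharp
using System;
using System.Threading.Tasks;

public static class LibraryCode
{
    // Hypothetical library method that does not need the caller's context.
    public static async Task<int> CountWordsAsync(string text)
    {
        string[] words = await Task
            .Run(() => text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .ConfigureAwait(false);

        // Because of ConfigureAwait(false), this continuation is not posted
        // back to the synchronization context that was current at the await.
        return words.Length;
    }
}
```

A UI event handler that awaits CountWordsAsync without ConfigureAwait(false) still resumes on the UI thread; only the continuation inside the library method skips the context switch.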

So, given that the compiler has replaced the original contents of the AsyncMethod() method, where did that code go? It has ended up in the MoveNext() method of the state machine. For example, after doing some initial set-up, it runs InitialWork():

private void MoveNext()
{
    try
    {
        TaskAwaiter awaiter;
        bool flag = true;
        if (this.<>1__state != 0)
        {
            this.<>4__this.InitialWork();
            […]

Disabling Async Mode in ANTS Performance Profiler 8 exposes this implementation detail:

In this example, the initial work was done on the originating thread, but execution then switched to a thread-pool thread to do the async work (see Figure 3). Execution then returns to the state machine, which moves on and executes the final part of the method. The more await statements there are, the more movements through the state machine are required.

Figure 3: Disabling Async Mode in ANTS Performance Profiler 8 shows the internal MoveNext methods and the switch to the thread pool

So what is the overhead of all this initialization, and how much persists on subsequent calls? Here, I’ve set up a simple WPF application to initiate some synchronous and asynchronous calls in response to button clicks. The potentially-asynchronous method, DoAsyncWork(), returns in 1ms. The first call to this method takes just over 1ms when called synchronously (Figure 4):

Figure 4: ANTS Performance Profiler 8 results for initial synchronous run of an example method

However, when going through the async state machine, the total time for the task to complete is over 70ms. Indeed, it takes 45ms just to get to the await statement, at which point the calling thread is unblocked (Figure 5):

Figure 5: Async Mode results for the first async run of an example method, where the Total time column shows the total time required for the async method to complete

There’s a lot of initialization happening here, and fortunately the overhead is much lower on subsequent runs, as we’ll see in the next example. When these methods are called 1000 times in a loop, the synchronous calls complete in barely any more time than the 1000ms the work itself would take (Figure 6):

Figure 6: Results after running an example method synchronously in a loop 1000 times

When the async method is called in a loop to run the same function, however, the total time increases because of the additional overhead. In this particular example, which involves both the async state machine and scheduling tasks to the thread pool, the increase is around 150ms over the 1000ms duration of the work itself, measured after running the method once to exclude JIT and thread pool initialization overhead (Figure 7). That’s an increase of around 15%.

Figure 7: Results after running the example method asynchronously in a loop 1000 times
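A comparison along these lines can be roughed out with a Stopwatch. This is only a sketch and not the WPF application profiled above: the class, the method names and the use of Thread.Sleep(1) as the unit of work are all assumptions, and Thread.Sleep(1) can take noticeably longer than 1ms depending on the system timer resolution, so only the difference between the two timings is meaningful.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class OverheadSketch
{
    // Hypothetical stand-in for a short unit of work.
    private static void DoWork()
    {
        Thread.Sleep(1);
    }

    // Runs the work directly on the calling thread.
    public static TimeSpan TimeSynchronous(int iterations)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            DoWork();
        }
        return stopwatch.Elapsed;
    }

    // Dispatches each unit of work to the thread pool and awaits it,
    // so every iteration pays the async and scheduling overhead.
    public static async Task<TimeSpan> TimeAwaitPerCallAsync(int iterations)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            await Task.Run(() => DoWork());
        }
        return stopwatch.Elapsed;
    }
}
```

Calling each method once with a small iteration count before taking measurements helps to keep JIT and thread pool initialization out of the figures.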

The continued overhead of async/await and the dispatching of individual tasks to the thread pool is actually quite small given the amount of work that’s being done, and there’s certainly no reason not to use it for methods that are potentially slow, especially given the benefits of running such code asynchronously. However, the overhead isn’t zero, so if you’re looking to maximize performance of frequently-called code you may want to avoid the use of async/await for very short methods, especially those called in a loop. Instead, wrap the async code around potentially slow methods or larger units of work where the added overhead is negligible.

Conclusion

Avoid using async/await for very short methods or having await statements in tight loops (run the whole loop asynchronously instead). Microsoft recommends that any method that might take longer than 50ms to return should run asynchronously, so you may wish to use this figure to determine whether it’s worth using the async/await pattern.
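As an illustration of running the whole loop asynchronously, the sketch below contrasts the two placements. Square is a hypothetical stand-in for a very short method, and the class and method names are likewise invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class LoopPlacement
{
    // Hypothetical very short method.
    private static int Square(int x)
    {
        return x * x;
    }

    // await inside the loop: one thread-pool dispatch and one trip
    // through the state machine for every iteration.
    public static async Task<long> AwaitInsideLoopAsync(int count)
    {
        long total = 0;
        for (int i = 0; i < count; i++)
        {
            int n = i; // copy so the lambda does not capture the loop variable
            total += await Task.Run(() => Square(n));
        }
        return total;
    }

    // Whole loop run as a single unit of work: one dispatch in total.
    public static Task<long> WholeLoopAsync(int count)
    {
        return Task.Run(() =>
        {
            long total = 0;
            for (int i = 0; i < count; i++)
            {
                total += Square(i);
            }
            return total;
        });
    }
}
```

Both methods produce the same total and keep the caller unblocked, but the second pays the scheduling cost once rather than once per iteration.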