C#7: Better Performance with Ref Locals, and Ref and Async Returns

Since the start of the year, we have been taking a look at the various new features coming to C# in C#7. This week, we will look at some of the changes to returning values from our functions that are there for those who need better performance. Before we begin, here is a summary of what we are covering in this series.

Generalized async return types

Up until now, an async method had to return Task, Task<T>, or void (though that last one is generally frowned upon1). However, returning Task or Task<T> can create performance bottlenecks as the reference type needs allocating. For C#7, we can now return other types from async methods, including the new ValueTask<T>, enabling us to have better control over these performance concerns. For more information, I recommend checking out the official documentation.

Ref Locals and Ref Returns

C#7 brings a variety of changes to how we get output from our methods; specifically, out variables, tuples, and ref locals and ref returns. I covered out variables in an earlier post and I will be covering tuples in the next one, so let's take a look at ref locals and ref returns. Like the changes to async return types, this feature is all about performance.

The addition of ref locals and ref returns enable algorithms that are more efficient by avoiding copying values, or performing dereferencing operations multiple times.2

Like many performance-related issues, it is difficult to come up with a simple real world example that is not entirely contrived. So, suspend your engineering minds for a moment and assume that this is a perfectly great solution for the problem at hand so that I can explain this feature to you. Imagine it is Halloween and we are counting how many pieces of candy we have collectively from all of our heaving bags of deliciousness3. We have several bags of candy with different candy types and we want to count them. So, for each bag, we group by candy type, then retrieve current count of each candy type, add the count of that type from the bag, and then store the new count.

This works, but it has overhead; we have to look up the candy count value in our dictionary multiple times when retrieving and setting the count. However, by using ref returns, we can create an alternative to our dictionary that minimises that overhead. In writing this example, I learned4 that since IDictionary does not do ref returns from its methods, we can't use it with ref locals directly. However, we also cannot use a local variable as we cannot return a reference to a value that does not live beyond the method call, so we must modify how we store our counts.

Now we are returning a reference to the actual stored value and changing it directly without repeated look-ups on our data type, making our algorithm perform better5. Be sure to check out the official documentation for alternative examples of usage.

Syntax Gotchas

Before we wrap this up, I want to take a moment to point out a few things about syntax. This feature uses the ref keyword a lot. You have to specify that the return type of a method is ref, that the return itself is a return ref, that the local variable storing the returned value is a ref, and that the method call is also ref. If you skip one of these uses of ref, the compiler will let you know, but as I discovered when writing the examples, the message is not particularly clear regarding how to fix it. Not only that, but you may get caught out when trying to consume by-reference returns as you can skip the two uses at the call-site (e.g. int count =_candyCounter.GetCount(candyName);); in such a case, the method call will be as if it were a regular, non-reference return; watch out.

In Conclusion

I doubt any of us will use these performance-related features much, if at all, and nor should we. In fact, I expect that their appearance will be a code smell in a majority of circumstances; a case of "ooo, shiny" usage6. I certainly think that the use of ref return with anything but value types will be highly unusual. That said, in those situations when every nanosecond of performance is required, these new C# additions will most definitely be invaluable.


  1. https://msdn.microsoft.com/en-us/magazine/jj991977.aspx 

  2. https://docs.microsoft.com/en-us/dotnet/articles/csharp/csharp-7#ref-locals-and-returns 

  3. Let's not focus on why we went trick or treating as adults and why anyone gave us candy in the first place 

  4. seems obvious now 

  5. In theory; after all, this is a convoluted example and I am sure there are better ways to improve performance than just using ref locals. Always measure your code to be sure you are making things better; don't guess 

  6. "Ooo, shiny" usage; when someone uses something just because it's new and they want to try it out 

The Need For Speed

Hopefully, those who are regular visitors to this blog1 have noticed a little speed boost of late. That is because I recently spent several days overhauling the appearance and performance with the intent of making the blog less frustrating and a little more professional. However, the outcome of my effort turned out to have other pleasant side effects.

I approached the performance issues as I would when developing software; I used data. In fact, it was data that drove me to look at it in the first place. Like many websites, this site uses Google Analytics, which allows me to poke around the usage of my site, see which of the many topics I have covered are of interest to people, what search terms bring people here (assuming people allow their search terms to be shared), and how the site is performing on various platforms and browsers. One day I happened to notice that my page load speeds, especially on mobile platforms, were pretty bad and that there appeared to be a direct correlation between the speed of pages loading and the likelihood that a visitor to the site would view more than one page before leaving2 . Thankfully, Google provides via their free PageSpeed Insights product, tips on how to improve the site. Armed with these tips, I set out to improve things.

Google PageSpeed Insights
Google PageSpeed Insights

Now, in hindsight, I wish I had been far more methodical and documented every step— it would have made for a great little series of blog entries or at least improved this one —but I did not, so instead, I want to summarise some of the tasks I undertook. Hopefully, this will be a useful overview for others who want to tackle performance on their own sites. The main changes I made can be organized into server configuration, site configuration, and content.

The simplest to resolve from a technical perspective was content, although it remains the last one to be completed mainly due to the time involved. It turns out that I got a little lazy when writing some of my original posts and did not compress images as much as I probably should have. The larger an image file is, the longer it takes to download, and this is only amplified by less powerful mobile devices. For new posts, I have been resolving this as I go by using a tool called PNGGauntlet to compress my images as either JPEG or PNG before uploading them to the site. Sadly, for images already uploaded to the site, I could only find plugins that ran on Apache (my installation of WordPress is on IIS for reasons that I might go into another time), would cost a small fortune to process all the images, or had reviews that implied the plugin might work great or might just corrupt my entire blog. I decided that for now, to leave things as they are and update images manually when I get the opportunity. This means, unfortunately, it will take a while. Thankfully, the server configuration options helped me out a little.

On the server side, there were two things that helped. The first, to ensure that the server compressed content before sending it to the web browser, did not help with the images, but it did greatly reduce the size of the various text files (HTML, CSS, and JavaScript) that get downloaded to render the site. However, the second change made a huge difference for repeat visitors. This was to make sure that the server told the browser how long it could cache content for before it needed to be downloaded again. Doing this ensured that repeat visitors to the site would not need to download all the CSS, JS, images, and other assets on every visit.

With the content and the server configuration modified to improve performance, the next and most important focus was the WordPress site itself. The biggest change was to introduce caching. WordPress generates HTML from PHP code. This takes time, so by caching the HTML it produces, the speed at which pages are available for visitors is greatly increased. A lot of caching solutions for WordPress are developed with Apache deployments in mind. Thankfully, I found that with some special IIS-specific tweaking, WP Super Cache works great3 .

At this point, the site was noticeably quicker and almost all the PageSpeed issues were eliminated. To finish off the rest, I added a few plugins and got rid of one as well. I used the Autoptimize plugin to concatenate, minify, compress, and perform other magic on the HTML, CSS, and JS files (this improved download times just a touch more by reducing the number of files the browser must request, and reducing the size of those files), I added JavaScript to Footer, a plugin that moves JavaScript to after the fold so that the content appears before the JavaScript is loaded, I updated the ad code (from Google) to use their latest asynchronous version, and I removed the social media plugin I was using, which was not only causing poor performance but was also doing some nasty things with cookies.

Along this journey of optimizing my site, I also took the opportunity to tidy up the layout, audit the cookies that are used, improve the way advertisers can target my ads, and add a sitemap generator to improve some of the ways Google (and other search engines) can crawl the site4. In all, it took about five days to get everything up and running in my spare time.

So, was it worth it?

Before and after
Before and after

From my perspective, it was definitely worth it (please let me know your perspective in the comments). The image above shows the average page load, server response, and page download times before the changes (from January through April – top row) and after the changes (June – bottom row). While the page download time has only decreased slightly, the other changes show a large improvement. Though I cannot tell for certain what changes were specifically responsible (nor what role, if any, the posts I have been writing have played5 ), I have not only seen the speed improve, but I have also seen roughly a 50-70% increase in visitors (especially from Russia, for some reason), a three-fold increase in ad revenue6, and a small decrease in Bounce Rate, among other changes.

I highly recommend taking the time to look at performance for your own blog. While there are still things that, if addressed, could improve mine (such as hosting on a dedicated server), and there are some things PageSpeed suggested to fix that are outside of my control, I am very pleased with where I am right now. As so many times in my life before, this has led me to the inevitable thought, "what if I had done this sooner?"


  1. hopefully, there are regular visitors 

  2. The percentage of visitors that leave after viewing only one page is known as the Bounce Rate 

  3. Provided you don't do things like enable compressing in WP Super Cache and IIS at the same time, for example. This took me a while to understand but the browser is only going to strip away one layer of that compression, so all it sees is garbled nonsense. 

  4. Some of these things I might blog about another time if there is interest (the cookie audit was an interesting journey of its own). 

  5. though I possibly could with some deeper use of Google Analytics 

  6. If that is sustained, I will be able to pay for the hosting of my blog from ad revenue for the first time 

LINQ: Deferred Execution

This is part of a short series on the basics of LINQ:

In the first rant post of this short series on LINQ, I explained the motivation behind writing this series in the first place, which can be summarised as:

People don't know LINQ and that impacts my ability to make use of it; I should try to fix that.

To start, I'm going to explain what I believe is the most important concept in LINQ; deferred execution.

So, what is deferred execution?

Deferred execution code is not executed until the result is needed; the execution is put off (deferred) until later. By doing this we can combine a series of actions without actually executing any of them, then execute them at the time we need a result. This allows us to limit the execution of computationally expensive operations until we absolutely need them.

That's my description, here is one from an MSDN tutorial on LINQ-to-XML that perhaps puts it more clearly:

Deferred execution means that the evaluation of an expression is delayed until its realized value is actually required. Deferred execution can greatly improve performance when you have to manipulate large data collections, especially in programs that contain a series of chained queries or manipulations. In the best case, deferred execution enables only a single iteration through the source collection.

Some may be surprised to know that deferred execution was not new when LINQ arrived, it had already been around for quite some time in the form of iterator methods. In fact, it is iterator methods that give LINQ its deferred execution. Before we look at an iterator method, let's look at an example of immediate execution. For this example, we will give ourselves the task of taking a collection of people and outputting a collection of unique last names for all those born before 1980.

When this method is called, it iterates over the entire collection and then returns its result, which also can then be iterated. If the collection of people were huge and we only cared about the first five names, this would be incredibly slow. To turn this into deferred execution, we can write it like this:

Now, none of the code in this method is executed until the first time something calls MoveNext() on the returned enumerable. This means we could take the first five names without processing the entire collection of people, giving us a potentially enormous performance gain. If each item in the generated collection were computationally expensive to produce, without lazy evaluation, that expense would be multiplied by the total number of items in the collection on every call to the generator method; however, with lazy evaluation, the consumer of the collection gets to decide how many items are computed and therefore, how much work gets done. This ability to defer and control computationally expensive operations is the power of deferred execution.

However, not every deferred action necessarily has low overhead. Deferred execution actually comes in two flavours; eager evaluation and lazy evaluation (the example above is an example of lazy evaluation)1. Every action in a deferred execution chain uses either lazy or eager evaluation. Though lazy evaluation is preferred, sometimes it is not possible to evaluate one item at a time, such as when sorting. Eagerly evaluated deferred execution allows us to at least defer the effort until we want it done.

An eagerly evaluated version of the iterator method we have been looking at might look like this:

In this example, because it is still an iterator method (it returns its results using yield return), none of the code is executed until the very first time the MoveNext() method is called on the returned enumerable, and therefore, the execution is deferred. When MoveNext() gets called for the first time, the entire collection of data is processed at once and then the results are output one by one as needed. The difference between this and the immediate execution equivalent we first looked at is that in this version, no work is done until a result is demanded.

Allowing the consumer of a collection to control how much work is done rather than work being dictated by the collection generator allows us to manage data more efficiently by building chains of operations and then processing the result in one go when needed. Lazy evaluation gives us the additional ability to spread the effort across each call to MoveNext(). The key to writing good LINQ is understanding which actions are immediate, which are deferred and lazily evaluated, which are deferred and eagerly evaluated, and why it matters. We will take a look at that next time.


  1. Quite often, people use 'deferred execution' and 'lazy evaluation' interchangeably, but they are not actually synonymous, nor is 'immediate execution' synonymous with 'eager evaluation'. 

Debugging IIS Express website from a HyperV Virtual Machine

Recently, I had to investigate a performance bug on a website when using Internet Explorer 8. Although we are fortunate to have access to BrowserStack for testing, I have not found it particularly efficient for performance investigations, so instead I used an HyperV virtual machine (VM) from modern.IE.

I had started the site under test from Visual Studio 2013 using IIS Express. Unfortunately, HyperV VMs are not able to see such a site out-of-the-box. Three things must be reconfigured first: the VM network adapter, the Windows Firewall of the host machine, and IIS Express.

HyperV VM Network Adapter

HyperV Virtual Switch Manager
HyperV Virtual Switch Manager

In HyperV, select Virtual Switch Manager… from the Actions list on the right-hand side. In the dialog that appears, select New virtual network switch on the left, then Internal on the right, then click Create Virtual Switch. This creates a virtual network switch that allows your VM to see your local machine and vice versa. You can then name the switch anything you want; I called mine LocalDebugNet.

New virtual network switch
New virtual network switch

To ensure the VM uses the newly created virtual switch, select the VM and choose Settings… (either from the context menu or the lower-right pane). Choose Add Hardware in the left-hand pane and add a new Network Adapter, then drop down the virtual switch list on the right, choose the switch you created earlier, and click OK to accept the changes and close the dialog.

Add network adapter
Add network adapter
Set virtual switch on network adapter
Set virtual switch on network adapter

Now the VM is setup and should be able to see its host machine on its network. Unfortunately, it still cannot see the website under test. Next, we have to configure IIS Express.

IIS Express

Open up a command prompt on your machine (the host machine, not the VM) and run ipconfig /all . Look in the output for the virtual switch that you created earlier and write down the corresponding IP address1.

Command prompt showing ipconfig
Command prompt showing ipconfig

Open the IIS Express applicationhost.config file in your favourite text editor. This file is usually found under your user profile.

Find the website that you are testing and add a binding for the IP address you wrote down earlier and the port that the site is running on. You can usually just copy the localhost binding and change localhost to the IP address or your machine name.

You will also need to run this command as an administrator to add an http access rule, where <ipaddress>  should be replaced with the IP you wrote down or your machine name, and <port>  should be replaced with the port on which IIS Express hosts your website.

At this point, you might be in luck. Try restarting IIS Express and navigating to your site from inside the HyperV VM. If it works, you are all set; if not, you will need to add a rule to the Windows Firewall (or whatever firewall software you have running).

Windows Firewall

The VM can see your machine and IIS Express is binding to the appropriate IP address and port, but the firewall is preventing traffic on that port. To fix this, we can add an inbound firewall rule. To do this, open up Windows Firewall from Control Panel and click Advanced Settings or search Windows for Windows Firewall with Advanced Security and launch that.

Inbound rules in Windows Firewall
Inbound rules in Windows Firewall

Select Inbound Rules on the left, then New Rule… on the right and set up a new rule to allow connections the port where your site is hosted by IIS Express. I have shown an example here in the following screen grabs, but use your own discretion and make sure not to give too much access to your machine.

New inbound port rule
New inbound port rule
Specifying rule port
Specifying rule port
Setting rule to allow the connection
Setting rule to allow the connection
Inbound rule application
Inbound rule application
Naming the rule
Naming the rule

Once you have set up a rule to allow access via the appropriate port, you should be able to see your IIS Express hosted site from inside your VM of choice.

As always, if you have any feedback, please leave a comment.


  1. You can also try using the name of your machine for the following steps instead of the IP