C#7: Better Performance with Ref Locals, and Ref and Async Returns

Since the start of the year, we have been taking a look at the various new features coming to C# in C#7. This week, we will look at some of the changes to returning values from our functions that are there for those who need better performance. Before we begin, here is a summary of what we are covering in this series.

Generalized async return types

Up until now, an async method had to return Task, Task, or void (though that last one is generally frowned upon1). However, returning Task or Task can create performance bottlenecks as the reference type needs allocating. For C#7, we can now return other types from async methods, including the new ValueTask, enabling us to have better control over these performance concerns. For more information, I recommend checking out the official documentation.

Ref Locals and Ref Returns

C#7 brings a variety of changes to how we get output from our methods; specifically, out variables, tuples, and ref locals and ref returns. I covered out variables in an earlier post and I will be covering tuples in the next one, so let's take a look at ref locals and ref returns. Like the changes to async return types, this feature is all about performance.

The addition of ref locals and ref returns enable algorithms that are more efficient by avoiding copying values, or performing dereferencing operations multiple times.2

Like many performance-related issues, it is difficult to come up with a simple real world example that is not entirely contrived. So, suspend your engineering minds for a moment and assume that this is a perfectly great solution for the problem at hand so that I can explain this feature to you. Imagine it is Halloween and we are counting how many pieces of candy we have collectively from all of our heaving bags of deliciousness3. We have several bags of candy with different candy types and we want to count them. So, for each bag, we group by candy type, then retrieve current count of each candy type, add the count of that type from the bag, and then store the new count.

void CountCandyInBag(IEnumerable<string> bagOfCandy)
{
    var candyByType = from item in bagOfCandy
                      group item by item;

    foreach (var candyType in candyByType)
    {
        var count = _candyCounter.GetCount(candyType.Key);
        _candyCounter.SetCount(candyType.Key, count + candyType.Count());
    }
}

class CandyCounter
{
    private readonly Dictionary<string, int> _candyCounts = new Dictionary<string, int>();

    public int GetCount(string candyName)
    {
        if (_candyCounts.TryGetValue(candyName, out int count))
        {
            return count;
        }
        else
        {
            return 0;
        }
    }

    public void SetCount(string candyName, int newCount)
    {
        _candyCounts[candyName] = newCount;
    }
}

This works, but it has overhead; we have to look up the candy count value in our dictionary multiple times when retrieving and setting the count. However, by using ref returns, we can create an alternative to our dictionary that minimises that overhead. In writing this example, I learned4 that since IDictionary does not do ref returns from its methods, we can't use it with ref locals directly. However, we also cannot use a local variable as we cannot return a reference to a value that does not live beyond the method call, so we must modify how we store our counts.

void CountCandyInBag(IEnumerable<string> bagOfCandy)
{
    var candyByType = from item in bagOfCandy
                      group item by item;

    foreach (var candyType in candyByType)
    {
        ref int count = ref _candyCounter.GetCount(candyType.Key);
        count += candyType.Count();
    }
}

class CandyCounter
{
    private readonly Dictionary<string, int> _candyCountsLookup = new Dictionary<string, int>();
    private int[] _counts = new int[0];

    public ref int GetCount(string candyName)
    {
        if (_candyCountsLookup.TryGetValue(candyName, out int index))
        {
            return ref _counts[index];
        }
        else
        {
            int nextIndex = _counts.Length;
            Array.Resize(ref _counts, nextIndex + 1);
            _candyCountsLookup[candyName] = _counts.Length - 1;
            return ref _counts[nextIndex];
        }
    }
}

Now we are returning a reference to the actual stored value and changing it directly without repeated look-ups on our data type, making our algorithm perform better5. Be sure to check out the official documentation for alternative examples of usage.

Syntax Gotchas

Before we wrap this up, I want to take a moment to point out a few things about syntax. This feature uses the ref keyword a lot. You have to specify that the return type of a method is ref, that the return itself is a return ref, that the local variable storing the returned value is a ref, and that the method call is also ref. If you skip one of these uses of ref, the compiler will let you know, but as I discovered when writing the examples, the message is not particularly clear regarding how to fix it. Not only that, but you may get caught out when trying to consume by-reference returns as you can skip the two uses at the call-site (e.g. int count =
_candyCounter.GetCount(candyName);
); in such a case, the method call will be as if it were a regular, non-reference return; watch out.

In Conclusion

I doubt any of us will use these performance-related features much, if at all, and nor should we. In fact, I expect that their appearance will be a code smell in a majority of circumstances; a case of "ooo, shiny" usage6. I certainly think that the use of ref return with anything but value types will be highly unusual. That said, in those situations when every nanosecond of performance is required, these new C# additions will most definitely be invaluable.

  1. https://msdn.microsoft.com/en-us/magazine/jj991977.aspx []
  2. https://docs.microsoft.com/en-us/dotnet/articles/csharp/csharp-7#ref-locals-and-returns []
  3. Let's not focus on why we went trick or treating as adults and why anyone gave us candy in the first place []
  4. seems obvious now []
  5. In theory; after all, this is a convoluted example and I am sure there are better ways to improve performance than just using ref locals. Always measure your code to be sure you are making things better; don't guess []
  6. "Ooo, shiny" usage; when someone uses something just because it's new and they want to try it out []

Octokit and the Authenticated Access

Last week, I introduced Octokit and my plans to write a tool that will mine our GitHub repositories for information that can be used to craft release notes. This week, we will look at the first step; authentication. I am using Octokit.NET for my hackery; if you choose to use another variant of Octokit, some of the types and methods available may be different, but you should be able to follow along. In addition, I have no intention of documenting every aspect of Octokit and the GitHub API, so if you are intrigued by anything that I do not discuss, I encourage you to explore the relevant documentation.

The main `GitHubClient` class, used to access the GitHub APIs, has several constructors, some that take credentials (sort of) and some that do not. All but one of the constructors take a `ProductHeaderValue` instance, which provides some basic information about the application that is accessing the API. According to the documentation, this information is used by GitHub for analytics purposes and can be whatever you want.

Now, if you only want to read information about publicly accessible repositories, you do not need to provide any authentication at all. You can create a client instance and just get stuck in, like this:

var githubClient = new GitHubClient(new ProductHeaderValue("Tinkering"));
var repo = await githubClient.Repository.Get("octokit", "octokit.net" );
Console.WriteLine(repo.Name);

However, you can only perform some read-only tasks on public repositories and, unless you are performing the most trivial of tasks, you will hit rate limits for unauthenticated access.

NOTE: All of the Octokit.NET calls are awaitable

Authentication can be achieved in a several ways; via an implementation of `ICredentialStore` passed to a constructor of `GitHubClient`, by providing credentials to the `GitHubClient.Connection.Credentials` property, or by using the `GitHubClient.Oauth`. The `OAuth` API allows an application to authenticate without ever having access to a user's credentials; it is understandably a little more complex than approaches that just take credentials. Since, at this point, our focus is to craft some methods for extending the API functionality, we will worry about the `OAuth` workflow another time. The other two approaches are quite similar, although the constructor-based approach requires a little extra effort. The following two examples will both give you authenticated access, though I think the constructor-based access feels a little less hacky:

// Without the constructor
var githubClient = new GitHubClient(new ProductHeaderValue("tinkering"));
githubClient.Connection.Credentials = new Credentials("username", "password");
// With the constructor
public class CredentialsStore : ICredentialsStore
{
    public Task<Credentials> GetCredentials()
    {
        return Task.Run(() => new Credentials("username","password"));
    }
}

var githubClient = new GitHubClient(new ProductHeaderValue("tinkering"), new CredentialsStore());

Two-factor Authentication

Of course, using your username and password is futile because you have two-factor authentication enabled1. Luckily there is a constructor on the `Credentials` class that takes a token, which you can generate on GitHub.

First, log into your GitHub account and choose Settings from the drop-down at the upper-right. On the fight, select Personal Access Tokens.

The right-hand side will change to the list of personal access tokens you have already created for your account (you may have created these yourself or an application may have created them via OAuth). Click the Generate New Token button and give it a useful name. You can now use this token as your credentials when using Octokit. I keep my token in the LINQPad password manager2 so that I can reference it in my code using the name I gave it, like this:

Util.GetPassword("the.name.I.gave.my.oauth.token")

In conclusion…

And that is it for this week. In the next entry of this series on Octokit, we will start getting to grips with releases and some of the basic pieces for my release note utility library.

  1. If you do not, you should rectify that []
  2. The LINQPad password manager is available via the File menu in LINQPad []