C#7 – somewhat abstract

C#7: Tools

I have spent the first couple of months of 2017 learning about the new features in C#7. This would not have been possible without some tools to help me play around with the new language syntax and associated types. Since we have to wait a little longer until Visual Studio 2017 is released, I thought you might like to know what tools I have been using to tinker in all things C#7.

LINQPad Beta

Link: http://www.linqpad.net/Download.aspx#beta

While early releases of Visual Studio 2017 (scheduled for release on March 7th) support the language, I initially found the release candidate to be unstable and frustrating. Not only that, but it can be cumbersome to spin up a quick example using Visual Studio, so I turned to my trusty friend, LINQPad.

LINQPad Beta showing me my return is missing a ref

I cannot recommend LINQPad enough, it is a fantastic tool for prototyping, poking around data sources, and more besides, like tinkering with language features you don't yet understand. While LINQPad's current release only supports the C# language up to version 6¹, the beta release also supports C# version 7. Not only can you use the language, but with the fantastic analysis window, you can see how Roslyn breaks down each part of the code. If you want to get started quickly, easily play around with the cool new features, and have a powerful tool for digging deeper as the need arises, the LINQPad beta is the tool to get.

Visual Studio 2017 RC

Link: https://www.visualstudio.com/downloads/

Yes, I know I said it was unstable and frustrating, but that was before, way back in January. These days the RC is much, much better and with the release date set for March 7th, there was never a better time to install Visual Studio 2017 RC and get a head start on getting to know some of the new things it can do, including C# 7. Tuples are fun, but poking around with them in the debugger is funner.

OzCode

Link: https://oz-code.com/

It is no secret that I love OzCode, the magical debugging extension for Visual Studio. It is so well-known that they asked me to be part of their OzCode Magician community program. So, it should come as no surprise that I have been using OzCode in my exploration of C#7. As the Visual Studio 2017 RC has matured, the clever people over at CodeValue have been creating previews of OzCode version 3, including amazing LINQ debugging support. Recently, I got to try an internal build that included support for all the cool new things in C#7.

OzCode 3 will be released on March 7th, the same day as Visual Studio 2017.

Documentation

Link: https://docs.microsoft.com/dotnet/articles/csharp/csharp-7

Never underestimate the power of reading documentation, it is one of the best tools out there. For my C#7 posts, I relied heavily on the new docs.microsoft.com site, specifically the .NET articles on C#. Not only is this a fantastic resource, but it has built-in support for commenting on the documentation so that you can ask questions and contribute to their improvement.

In Conclusion

This is the entire list of tools I used for my C#7 investigations. Try them out and get an early start on C#7 fun before the March 7th release of all the C#7 goodness. Happy tinkering and if you stumble on any useful tools, please share in the comments!

It also supports SQL, F#, and VB [↩]

C#7: Tuples

Well, here we are; the final post in this series on the new C#7 features. Over the last six weeks we have covered all but one, including:

While pattern matching is exciting, local functions are useful, and some of the other enhancements are nice, I personally feel that I have saved the best for last, because this week it is time to look at tuples.

Tuples

Wait, don't we already have tuples in C#? I mean, we totally have a type called Tuple so it sure seems like it. In fact, we have at least one other tuple-type too, KeyValuePair<TKey,TValue>. However, these are not C# concepts, they are .NET types and they fall pretty short of being the kind of tuples we want. I did not know what was really wrong with them until I learned about the tuples we get with C#7.

It turns out, the old Tuple type is limited; the members have fixed names (Item1, Item2, etc.) which do not communicate their purpose, it is a reference type and as such can have a measurable impact on performance due to object allocations, and it has no syntactical support within the language¹. To support the new C#7 feature, there is a new tuple type that is more performant and better suited to the awesomeness that is C#7 tuples; it is a value-type called ValueTuple and it comes with some friends like TupleElementNamesAttribute to give us something wonderful.

If you want to play with C#7 tuples in Visual Studio 2017 RC, you'll need to grab the NuGet package so that the appropriate types to support tuples are available for the compiler. LINQPad Beta already includes everything you need.

Syntax

To define a new tuple in C#7, we use parentheses in an assignment like this:

var myTuple = ("John", "Smith", 45, ConsoleColor.DarkRed);

This creates a new tuple that looks an awful lot like something we might have created using Tuple.Create("John", "Smith", 45, ConsoleColor.DarkRed), since the values of the tuple are referenced via Item1, Item2, Item3, and Item4. That does not seem like a particular useful addition. However, we can make this tuple so much better with some simple changes to our assignment, like this:

var myTupleOfAwesomeness =
    ( FirstName: "John", LastName: "Smith", Age: 45, FavouriteConsoleColor: ConsoleColor.DarkRed);

Now, instead of the ItemX members, we can access our values using FirstName, LastName, Age, and FavouriteConsoleColor, as if we had instead created an anonymous type. Of course, the great thing is that we did not introduce a new type to do this; the C# compiler pulled some magic along with the new ValueTuple type² . Of course, it does not end there; we can take advantage of a new feature called deconstruction to name the fields on the left of the assignment instead:

(string FirstName, string LastName, int Age, ConsoleColor FavouriteConsoleColor) myTupleOfAwesomeness =
    ("John", "Smith", 45, ConsoleColor.DarkRed);

In fact, we can provide names on both sides although the ones provided on the left take precedent and you will get a compiler warning. However, being able to provide names via the right-hand side of an assignment is useful when a method returns a tuple and you want to provide some meaningful names locally. This is even more useful when you find out that you can drop the tuple variable name:

(string FirstName, string LastName, int Age, ConsoleColor FavouriteConsoleColor) =
    ( "John", "Smith", 45, ConsoleColor.DarkRed);

We basically just assigned values to four local variables that we can then use in the rest of our code, but if we had a method like this:

ValueTuple<string,string,int,ConsoleColor> GetPerson()
{
    return ("John", "Smith", 45, ConsoleColor.DarkRed);
}

We could do this on the right-hand side of that assignment:

(var firstName, var lastName, var age, var favouriteConsoleColor) = GetPersonInfo();

What we did there was deconstruct the returned tuple into four separate fields. Our method declares its return type as ValueType<string, string, int, ConsoleColor> and returns a tuple, then we assign that tuple to four variables by using the deconstruction syntax. In this example, I even switched to using var instead of explicitly typing the deconstruction. The var keyword can be used for all of the deconstructed fields, just a couple with the remainder strongly typed, or none; it's up to you and your code reviewers.

Deconstruction is going to change the way you write methods and some of your types, and will probably mean a lot less uses for out variables. That said, do not get carried away. Tuples are best for internal and private methods rather than public APIs where a static type is much more helpful to the consumer of your API.

The Deconstruct Method

If a method returns a tuple (e.g. return ("John", "Smith")), we can use deconstruction to assign its parts to different variables if we wish. However, deconstruction is not reserved for just tuples. We can also define how our own types can be deconstructed into separate variables. To do this, we define a Deconstruct method on our type that tells the compiler how to deconstruct it into separate values. For example:

public class Person
{
    public Person(string firstName, string lastName, int age, ConsoleColor favouriteConsoleColor)
    {
        FirstName = firstName;
        LastName = lastName;
        Age = age;
        FavouriteConsoleColor = favouriteConsoleColor;
    }
    
    public void Deconstruct(out string firstName, out string lastName, out int age, out ConsoleColor favouriteConsoleColor)
    {
        firstName = FirstName;
        lastName = LastName;
        age = Age;
        favouriteConsoleColor = FavouriteConsoleColor;
    }

    public string FirstName { get; }
    public string LastName { get; }
    public int Age { get; }
    public ConsoleColor FavouriteConsoleColor { get; }
}

With this type defined, we can use the deconstruction syntax with it just as we would with a ValueTuple:

var person = new Person("John", "Smith", 45, ConsoleColor.DarkRed);
(string firstName, string lastName, int age, ConsoleColor favouriteConsoleColor) = person;

I am not sure when I will use this ability to define deconstruction for my types, but I can imagine when the moment arises I will know. Also, you can see from its extensive use in the Deconstruct method, out is certainly not redundant.

Anonymous Types

new {a, b}

// -or-

(a, b)

Tuples in C#7 seem to out-do anonymous types in every way; tuples don't define a new type, they can be returned from methods without the overheads of object or dynamic, they support deconstruction, and they take up less typing; yet as far as I can tell they can be used everywhere that anonymous types can be used. It is likely that I am missing something important, but from my perspective thus far, I think tuples in C#7 not only kills off any new uses of the Tuple type, but also anonymous types.

In Conclusion

I really like this feature and the simplicity of its usage. It is light on extra syntax, adding the concept of a parenthetically-typed variable, but powerful in what it allows us to do with method returns and type deconstruction. I would like to give it some more thought and use it in more scenarios, but I also think tuples in C#7 will render anonymous types obsolete.

You can skim the overview of tuples in the What's new in C# 7 document, however, this is a simple feature with a complex background, so I highly recommend reading the more in-depth coverage on the Tuples official documentation, especially with regards to assignment of tuples with explicit member names³. It is a shame that it took two attempts to get tuples to be where they should be in C#, leaving yet another type⁴ by the wayside, but they are here at last and we will all be grateful.

What do you think? Are anonymous types relegated to legacy code only? Will Tuple.Create ever get used again? Is there something I should cover that is missing here or in the official documentation? Sound off in the comments and let me know.

C# Concepts: Tuples [↩]
Actually, like dynamic, the field names use an attribute to communicate the names of each tuple field [↩]
Hint: member names are not attached to the value, but instead stay with the variable for which they were declared [↩]
Tuple.Create may never get used again [↩]

C#7: Better Performance with Ref Locals, and Ref and Async Returns

Since the start of the year, we have been taking a look at the various new features coming to C# in C#7. This week, we will look at some of the changes to returning values from our functions that are there for those who need better performance. Before we begin, here is a summary of what we are covering in this series.

Generalized async return types

Up until now, an async method had to return Task, Task, or void (though that last one is generally frowned upon¹). However, returning Task or Task can create performance bottlenecks as the reference type needs allocating. For C#7, we can now return other types from async methods, including the new ValueTask, enabling us to have better control over these performance concerns. For more information, I recommend checking out the official documentation.

Ref Locals and Ref Returns

C#7 brings a variety of changes to how we get output from our methods; specifically, out variables, tuples, and ref locals and ref returns. I covered out variables in an earlier post and I will be covering tuples in the next one, so let's take a look at ref locals and ref returns. Like the changes to async return types, this feature is all about performance.

The addition of ref locals and ref returns enable algorithms that are more efficient by avoiding copying values, or performing dereferencing operations multiple times.²

Like many performance-related issues, it is difficult to come up with a simple real world example that is not entirely contrived. So, suspend your engineering minds for a moment and assume that this is a perfectly great solution for the problem at hand so that I can explain this feature to you. Imagine it is Halloween and we are counting how many pieces of candy we have collectively from all of our heaving bags of deliciousness³. We have several bags of candy with different candy types and we want to count them. So, for each bag, we group by candy type, then retrieve current count of each candy type, add the count of that type from the bag, and then store the new count.

void CountCandyInBag(IEnumerable<string> bagOfCandy)
{
    var candyByType = from item in bagOfCandy
                      group item by item;

    foreach (var candyType in candyByType)
    {
        var count = _candyCounter.GetCount(candyType.Key);
        _candyCounter.SetCount(candyType.Key, count + candyType.Count());
    }
}

class CandyCounter
{
    private readonly Dictionary<string, int> _candyCounts = new Dictionary<string, int>();

    public int GetCount(string candyName)
    {
        if (_candyCounts.TryGetValue(candyName, out int count))
        {
            return count;
        }
        else
        {
            return 0;
        }
    }

    public void SetCount(string candyName, int newCount)
    {
        _candyCounts[candyName] = newCount;
    }
}

This works, but it has overhead; we have to look up the candy count value in our dictionary multiple times when retrieving and setting the count. However, by using ref returns, we can create an alternative to our dictionary that minimises that overhead. In writing this example, I learned⁴ that since IDictionary does not do ref returns from its methods, we can't use it with ref locals directly. However, we also cannot use a local variable as we cannot return a reference to a value that does not live beyond the method call, so we must modify how we store our counts.

void CountCandyInBag(IEnumerable<string> bagOfCandy)
{
    var candyByType = from item in bagOfCandy
                      group item by item;

    foreach (var candyType in candyByType)
    {
        ref int count = ref _candyCounter.GetCount(candyType.Key);
        count += candyType.Count();
    }
}

class CandyCounter
{
    private readonly Dictionary<string, int> _candyCountsLookup = new Dictionary<string, int>();
    private int[] _counts = new int[0];

    public ref int GetCount(string candyName)
    {
        if (_candyCountsLookup.TryGetValue(candyName, out int index))
        {
            return ref _counts[index];
        }
        else
        {
            int nextIndex = _counts.Length;
            Array.Resize(ref _counts, nextIndex + 1);
            _candyCountsLookup[candyName] = _counts.Length - 1;
            return ref _counts[nextIndex];
        }
    }
}

Now we are returning a reference to the actual stored value and changing it directly without repeated look-ups on our data type, making our algorithm perform better⁵. Be sure to check out the official documentation for alternative examples of usage.

Syntax Gotchas

Before we wrap this up, I want to take a moment to point out a few things about syntax. This feature uses the ref keyword a lot. You have to specify that the return type of a method is ref, that the return itself is a return ref, that the local variable storing the returned value is a ref, and that the method call is also ref. If you skip one of these uses of ref, the compiler will let you know, but as I discovered when writing the examples, the message is not particularly clear regarding how to fix it. Not only that, but you may get caught out when trying to consume by-reference returns as you can skip the two uses at the call-site (e.g. int count = _candyCounter.GetCount(candyName);); in such a case, the method call will be as if it were a regular, non-reference return; watch out.

In Conclusion

I doubt any of us will use these performance-related features much, if at all, and nor should we. In fact, I expect that their appearance will be a code smell in a majority of circumstances; a case of "ooo, shiny" usage⁶. I certainly think that the use of ref return with anything but value types will be highly unusual. That said, in those situations when every nanosecond of performance is required, these new C# additions will most definitely be invaluable.

https://msdn.microsoft.com/en-us/magazine/jj991977.aspx [↩]
https://docs.microsoft.com/en-us/dotnet/articles/csharp/csharp-7#ref-locals-and-returns [↩]
Let's not focus on why we went trick or treating as adults and why anyone gave us candy in the first place [↩]
seems obvious now [↩]
In theory; after all, this is a convoluted example and I am sure there are better ways to improve performance than just using ref locals. Always measure your code to be sure you are making things better; don't guess [↩]
"Ooo, shiny" usage; when someone uses something just because it's new and they want to try it out [↩]

C#7: Local Functions

We have been racing through the C#7 features in my latest series of posts; here is a list of what I have covered and what will be covered:

In today's post, we will look at local functions. Those who are familiar with JavaScript will be familiar with local functions; the ability to define a method inside of another method. We have had a similar ability in C# since anonymous methods were introduced albeit in a slightly less flexible form¹. Up until C#7, methods defined within another method were assigned to variables and were limited in use and content. For example, one could not use yield inside anonymous methods.

Local functions allow us to declare a method within the scope of another method; not as an assignment to a run-time variable as with anonymous methods and lambda expressions, but as a compile-time symbol to be referenced locally by its parent function. With lambda expressions, the compiler does some heavy lifting for us, creating an anonymous type to hold our method and its closures; also, to call them, a delegate must be instantiated and then invoked, which incurs additional memory overhead to standard method calls: none of this is necessary for local functions as they are regular methods. Not only that, but because these are regular methods, the full range of method syntax is available to us, including yield.

The official documentation provides a good overview of the differences between anonymous methods/lambda expressions and local functions, which ends with this paragraph:

While local functions may seem redundant to lambda expressions, they actually serve different purposes and have different uses. Local functions are more efficient for the case when you want to write a function that is called only from the context of another method.

The last sentence of that infers one of the most useful things about local methods; streaming enumerable sequences with fail early support. The yield syntax introduced in C#3 has been incredibly useful, simplifying the work necessary for defining an enumerator from writing an entire class to just writing a method. However, due to the way enumeration works, we often have to split our enumerator methods into two so that things like argument validation occur immediately, rather than when the first item in the sequence is requested, like this:

public static IEnumerable<int> Numbers(int count)
{
    if (count <= 0) throw new ArgumentException();

    return NumbersImpl(count);
}

private static IEnumerable<int> NumbersImpl(int count)
{
    for (int i = 0; i < count; i++)
    {
        yield return i;
    }
}

The NumbersImpl method is only ever used by the public-facing Numbers method, but we cannot make that any clearer. However, with C#7 and local functions, we can now embed that method declaration into the Numbers method and make our code explicit.

public static IEnumerable<int> Numbers(int count)
{
    if (count <= 0) throw new ArgumentException();
    
    return NumbersImpl();

    IEnumerable<int> NumbersImpl()
    {
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }
}

There are a couple of things to note here. First, we can declare the local function after it has been called; unlike anonymous methods and lambda expressions, local functions are just like any other method; they are part of the program declaration rather than its execution. Second, and somewhat surprisingly², we can use closures.

Closures in Local Functions

With lambda expressions and anonymous methods, closures are hoisted into member variables of an anonymous type, but this is not how local functions work; local functions do not necessarily get their own anonymous type³. So how do closures work in local functions? Well, the compiler performs a little code-rewriting for us to effectively hoist the closures into method arguments so that we don't have to repeat ourselves.

In Conclusion

Local functions provide a valuable alternative to anonymous methods and lambda expressions, and one-time use member functions. Not only do they make a clear distinction between run-time and compile-time, making the intent clear, but also by co-locating a single-use method with the code that calls it, we can make our code more readable. Of course, as with many other features of high-level languages like C#, this could be abused and make code horribly unreadable, so keep each other honest and be sure to call out incorrect or dubious usage when you see it.

What do you think? Will you use this new feature of C#? Does it make the language better? Please leave your views in the comments. Next week, we'll take a look at some of the changes C#7 makes to returning values from our methods.

Lambda expressions are a terser syntax for anonymous methods [↩]
at least to me [↩]
Local functions that implement enumerators do get an anonymous type, but that's a special case [↩]

C#7: Pattern Matching

So far in this series on C#7, we have looked at some nice new things, including out variables, new expression-bodied members, throw expressions, and binary numeric literals. These are all great little additions, but this week, we get to a truly cool and long-awaited feature; pattern matching. But, before we get into that, here is the usual summary of what I am covering.

Pattern Matching

I will admit that I was a tad confused as to why this is called "pattern matching". I read the Wikipedia article¹. I expect I would understand this more if I were a Computer Science major instead of Computer Systems Engineering². According to Wikipedia:

In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern.

The What's New in C#7 documentation explains:

Pattern matching is a feature that allows you to implement method dispatch on properties other than the type of an object.

While I now understand the academic concept of pattern matching, I think we may regret using the term to specifically reference the features that are under its umbrella for C#7. I would have preferred it if these features had been introduced using already widely understood and consistent nomenclature. I think the official documentation agrees with me, since it almost immediately splits the feature into two pieces; is expressions and switch expressions. However, I am going to draw the line that divides the parts of pattern matching in a different place and call the two parts cast-conditional variables and case filters, because that fits better with my understanding of what they do.

Cast-conditional Variables

Many of us know the following code:

var x = y as MyType;
if (x != null)
{
   //Do the awesome!
}

This is the efficient way of verifying a value is a given type and then consuming it as that type since we cast just once³. However, with the new cast-conditional feature, we can condense this down to:

if (y is MyType x)
{
    //Do the awesome!
}

The cast and the check of its success have been condensed into a single, easily understood statement. We can use this cast-conditional variable syntax in any place where we might otherwise but a boolean expression:

// Example 1: Calculating a boolean
bool isDoubleNan = value is double y && Double.IsNaN(y);

// Example 2: Ternary operator
string filePath = size is int x && x > 10 ? "C:\Archive" : "C:\TinyFiles";

This feels like a great improvement to me and fits really well with the similar out variable feature, however it is only a new way of writing something we could already do. When extended to the cases in switch statements (albeit with a slightly different syntax), this gives us a brand new ability; switching based on variable type:

switch (myInterface)
{
case MyTypeA a:
    // Do something because we know this is of type, MyTypeA
    break;

case MyTypeB b:
    // Do something because we know this is of type, MyTypeB
    break;
}

Now you can iterate over that array of object values and have a nice, succinct way of processing the contents by type. Not only that, but with case filters, you can craft even finer conditions for your switches.

Case Filters

C#6 introduced the when keyword as a modifier to catch expressions so that we could finally utilize exception filters from C#. Now, the when keyword gets a similar job in switch statements as a modifier to case statements. Like in exception filters, the when expression is a condition that must be met for the case to be a match. For example, we could create an IsNumber method and use when to filter cases like Infinity and NaN:

bool IsNumber(object value)
{
    switch (value)
    {
    case int x:
    case float y when !float.IsNaN(y) && !float.IsInfinity(y):
    case double z when !double.IsNaN(z) && !double.IsInfinity(z):
        return true;

    default:
        return false;
    }
}

Prior to C#7, this code would look something like this:

bool IsNumberOld(object value)
{
    int? x = value as int?;
    if (x.HasValue) return true;
    
    float? y = value as float?;
    if (y.HasValue && !float.IsNaN(y.Value) && !float.IsInfinity(y.Value)) return true;

    double? z = value as double?;
    if (z.HasValue && !double.IsNaN(z.Value) && !double.IsInfinity(z.Value)) return true;

    return false;
}

Not only was there more typing before C#7, but I think the code was more repetitive and harder to scan. This may be a convoluted example, but I hope it illustrates how valuable this new language syntax can be.

In Conclusion

Naming disagreements aside, the new pattern matching in C#7 is powerful. With that power comes responsibility; the responsibility to use it wisely and to call out others who do not, for there is surely great room for abuse with this feature. I envisage frequent and appropriate use of cast-conditional variables in if statements since the scenarios to which that caters are widespread. However, the filtering added to switch statements brings something entirely new and so, I do not see it being as widely adopted not as appropriately used; time will tell.

Overall, I love this addition to C#7 even though I do not like the name. What do you think? Does "pattern matching" make sense or should it be something like "cast-conditional variables" and "case filters" instead? Will you use this feature a lot? When might you find pattern matching useful? Sound off in the comments and discuss.

that's a lie; I read parts of it until I came to the conclusion that it was not helping [↩]
Is this a theory versus practice issue again? I face those often in this field [↩]
Using the is condition and then casting inside the if statement would cause two casts; one for the is and another for the cast [↩]