Syntax – Page 2 – somewhat abstract

C#7: Out Variables

Last time, we started to look at the new features introduced in C#7. Here is a quick refresher of just what those features are:

In this post, we will look at one of the simplest additions to the C# language: out variables.

int dummy

How often have you written code like this?

int dummy;

if (int.TryParse(someString, out dummy) && dummy > 0)
{
   // Do something
}

Or this?

double dummy;

if (myDictionary.TryGetValue(key, out dummy))
{
   //Do something
}

Sometimes you use the out value retrieved and sometimes you do not; often you only use it within the scope of the condition. In any case, there is always the variable definition awkwardly hanging out on its own line, looking more important than it really is and leaving space for it to accidentally get used before it has been initialized. Thankfully, C#7 helps us tidy things up by allowing us to combine the variable definition with the argument.

Using the out variable syntax, we can write this:

if (int.TryParse(someString, out int dummy) && dummy > 0)
{
    //Do something
}

In fact, we do not even need to declare the type of the variable explicitly. While often we want to be explicit to make it clear that it matters (and to ensure we get some compile-time checking of our assumptions), we can use an implicitly typed variable like this:

if (myDictionary.TryGetValue(someKey, out var dummy))
{
    //Do something
}
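
To see both forms in one place, here is a minimal sketch. The `OutVariableSketch` class and its `ParsePositive` and `StockOrZero` helpers are hypothetical names made up for illustration, and I have used an `int` dictionary rather than the `double` one above so the printed output does not depend on culture settings.

```csharp
using System;
using System.Collections.Generic;

public static class OutVariableSketch
{
    // Hypothetical helper: returns the parsed value when it is a positive integer, otherwise -1.
    public static int ParsePositive(string text)
    {
        // The out variable is declared right where it is passed as an argument.
        if (int.TryParse(text, out int value) && value > 0)
        {
            return value;
        }

        // 'value' is still in scope here (it belongs to the enclosing block),
        // and TryParse always assigns it, so it could be read at this point too.
        return -1;
    }

    // Hypothetical helper: 'out var' infers int from the dictionary's value type.
    public static int StockOrZero(Dictionary<string, int> stock, string key)
    {
        return stock.TryGetValue(key, out var count) ? count : 0;
    }

    public static void Main()
    {
        var stock = new Dictionary<string, int> { ["tea"] = 3 };

        Console.WriteLine(ParsePositive("42"));        // 42
        Console.WriteLine(ParsePositive("nope"));      // -1
        Console.WriteLine(StockOrZero(stock, "tea"));  // 3
        Console.WriteLine(StockOrZero(stock, "milk")); // 0
    }
}
```

Note that the out variable does not disappear at the end of the `if`; it lives in the enclosing block, which is worth remembering when choosing names.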

In Conclusion

out variables are part of a wider set of features for reducing repetition (in written code and in run-time execution), and saying more with less (i.e. making it easier for us to infer intent from the code without additional commentary). This is a very simple addition to C# syntax, yet useful. Not only does it reduce what we need to type, it also improves code clarity (in my opinion), and reduces the possibility of silly errors like using a variable before it has been initialized, or worse, thinking that it being uninitialized was a mistake and hiding a bug by initializing it.

Until next time, if you would like to tinker with any of the C#7 features I have been covering, I recommend getting the latest LINQPad beta or Visual Studio 2017 RC.

C#7: Binary Literals and Numeric Literal Digit Separators

Happy New Year, y'all! I thought I would kick off 2017 with a look at C#7. The next release of Visual Studio will soon be upon us and with it a new version of C#. As with its predecessor, C#6, C#7 brings a variety of syntactical and compiler magic allowing us to do more work with less code. Just as the new features of C#6 enabled us to make code more readable by reducing ceremony and making intent clearer¹, so go the new features of C#7.

Before we take a closer look, here is an overview of the goodies in C#7:

It is a shorter list than the new features for C#6, but there is still a lot of goodness crammed in there. Over the next few posts, I want to delve into these features just a literal to familiarize myself (and you) with them and how they may impact the code we write. So, without further ado, let's take a look at the first two items on the list: binary literals and numeric literal digit separators.

Binary Literals

Numeric literals are not a new concept in C#. We have been able to define integer values in base-10 and base-16 since C# was first released. Common use cases for base-16 (also known as hexadecimal) literals are defining flags and bit masks in enumerations and constants. Since each digit in a base-16 number is 4 bits wide, the bits in that digit are represented by the values 1, 2, 4, and 8.

[Flags]
public enum Option
{
    None    = 0x00,
    Option1 = 0x01,
    Option2 = 0x02,
    Option3 = 0x04,
    Option4 = 0x08,
    Option5 = 0x10,
    Option6 = 0x20,
    Option7 = 0x40,
    Option8 = 0x80,
    All     = 0xFF
}

While this is familiar to most, using C#7 we can now express such things explicitly in base-2, more commonly referred to as binary. While hexadecimal literals are prefixed with 0x, binary literals are prefixed with 0b.

[Flags]
public enum Option
{
    None    = 0b00000000,
    Option1 = 0b00000001,
    Option2 = 0b00000010,
    Option3 = 0b00000100,
    Option4 = 0b00001000,
    Option5 = 0b00010000,
    Option6 = 0b00100000,
    Option7 = 0b01000000,
    Option8 = 0b10000000,
    All     = 0b11111111
}

Although I am used to using base-16 numbers for this, I can see value in being explicit by using binary literals. The strength comes when more than one bit is set. When using base-16, it can be easy to make a mistake and it is not immediately obvious what bits are set by a specific value². With binary literals, it is immediately obvious without additional, potentially erroneous side calculations.
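
As a small sketch of that point, here is the same multi-bit mask written both ways. The `BinaryLiteralSketch` class and the trimmed-down `Option` enumeration are stand-ins I have made up for illustration; the mask value itself is arbitrary.

```csharp
using System;

[Flags]
public enum Option
{
    // A trimmed-down version of the enumeration above, keeping only the bits used here.
    None    = 0b00000000,
    Option1 = 0b00000001,
    Option3 = 0b00000100,
    Option4 = 0b00001000,
    Option6 = 0b00100000
}

public static class BinaryLiteralSketch
{
    // The same mask twice: working out which bits 0x2C sets takes a moment,
    // whereas the binary form shows bits 2, 3, and 5 directly.
    public const int MaskHex    = 0x2C;
    public const int MaskBinary = 0b00101100;

    public static void Main()
    {
        Console.WriteLine(MaskHex == MaskBinary); // True
        Console.WriteLine((Option)MaskBinary);    // Option3, Option4, Option6
    }
}
```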

Digit Separators

Of course, binary values can get big fast and keeping track of which things line up with which can be fraught with problems. Sure, we can try to line up the values, but what if the indentation gets one space off? Will we really notice during that code review?

To help with readability like this and to assist in avoiding silly off-by-one issues that can arise due to misaligned values, C#7 introduces _ as a digit separator for all numeric literals. This separator is stripped out by the compiler; it is just syntactical candy to aid readability and serves no purpose within the compiled code. For example, our enumeration above that uses binary literals can be rewritten as follows:

[Flags]
public enum Option
{
    None    = 0b0000_0000,
    Option1 = 0b0000_0001,
    Option2 = 0b0000_0010,
    Option3 = 0b0000_0100,
    Option4 = 0b0000_1000,
    Option5 = 0b0001_0000,
    Option6 = 0b0010_0000,
    Option7 = 0b0100_0000,
    Option8 = 0b1000_0000,
    All     = 0b1111_1111
}

I think this really does help with readability although I was disappointed to find that I could not use this separator directly after the base modifier. I do not know about anyone else, but it seems more readable to separate the modifier from the actual value. Thankfully, we can pad the left of our number with zeroes as long as the value we define fits into the type we are assigning.

byte a = 0b_0000_0001;  //INVALID: Digit separator cannot be at the start or end of the value
byte b = 0b1_0000_0001; //INVALID: 257 doesn't fit in a byte
byte c = 0b0_0000_0001; //VALID: 1 fits into a byte just fine

I suspect we may start seeing code that uses this "padding plus separator" approach once C#7 gets wider acceptance as I think it really improves readability; 0b0_0001_0000 is clearer to me than 0b0001_0000.

In addition, the digit separator is not limited to just binary numeric literals; it can be used in any numeric literal. For example, use it to separate 32-bit parts of a large hexadecimal number, or as a thousands separator in a floating point value; anywhere that it improves readability.
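
A quick sketch of those other uses follows. The `DigitSeparatorSketch` class and the constant names are hypothetical; the comparisons simply confirm that the separators do not change the values.

```csharp
using System;

public static class DigitSeparatorSketch
{
    public const ulong Hex64    = 0x01234567_89ABCDEF; // two 32-bit halves
    public const int Million    = 1_000_000;           // thousands separators
    public const double Precise = 1_234.567_89;        // separators on both sides of the decimal point

    public static void Main()
    {
        // The separators are stripped by the compiler, so each pair is the same value.
        Console.WriteLine(Hex64 == 0x0123456789ABCDEFUL); // True
        Console.WriteLine(Million == 1000000);            // True
        Console.WriteLine(Precise == 1234.56789);         // True
    }
}
```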

In Conclusion

The new binary literal syntax and digit separator should help to make intent clearer and code easier to read. As with any language feature, we must use our best judgement to ensure they are used appropriately. For more information on the features covered in this post, see the official documentation where you can also discover other C#7 magic that I will be covering in my upcoming posts.

¹ Things like read-only auto-properties, expression-bodied member functions, exception filters, null-conditional operators, and the nameof operator to name a few
² I know some can see in hex, and that's great, but not everyone is so adept

C#6: Exception Filters

For the last six¹ releases, the C# compiler has been keeping part of the .NET Framework secret from us²: exception filters. It turns out that the .NET Framework has supported exception filters since the very beginning; there was just no way to express them using C# until now.

C#6 adds the when keyword for use in try/catch blocks to specify exception filters. An exception filter is a predicate method that takes the thrown exception and returns true when the exception should be caught or false when it should not. If the filter says the exception should not be caught, the underlying system can continue to throw it.

This allows us to reduce the complexity in our code as we can put multiple catch statements with different filtering rules in the same try/catch block. This gives a switch-style approach to exception handling that is supported at the lowest level, reducing the need to rethrow exceptions (or to remember the difference between throw; and throw exceptionVar;)³.

Here is a try/catch block showing an example of exception filtering:

Func<ArgumentException,string,bool> filterParameterName = (e,s) => e.ParamName == s;
try
{
    CallSomething("param1", "param2", "param3", "param4");
}
catch (ArgumentException ex) when (ex.ParamName == "param1")
{
    Console.WriteLine("Filtered: param1");
}
catch (ArgumentException ex) when (filterParameterName(ex, "param2"))
{
    Console.WriteLine("Filtered: param2");
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Unfiltered: {ex.ParamName}");
}

Before I continue, I must state that this is a completely contrived example for demonstration purposes; your filters would probably act on more than just the value of a string, the two filters shown would use the same code, and the handling would involve different things in each catch⁴.

Now, some things to note. First, the parentheses around the when condition are mandatory; you don't need to remember this as the compiler and syntax highlighting will remind you. Second, the content of the when condition must evaluate to bool; you cannot specify a lambda expression here. I am certain most of you already assumed that, but for some reason, I felt like that should be possible. However, when is akin to if or while, so it makes sense that a lambda expression would not work.

The example above provides three different catch blocks for the exact same exception type, ArgumentException. Each filter is evaluated in the order specified, so, if CallSomething() threw an ArgumentException with ParamName set to param2, the when condition on the first catch would reject it, but the second filter would catch it and handle accordingly. A ParamName value filtered out of the first two catch blocks would fall into the last.
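
To watch that ordering happen, here is a runnable sketch built around the same three catch blocks. The `ExceptionFilterSketch` class, the `Handle` method, and this always-throwing version of `CallSomething` are assumptions made up for the demonstration; the real `CallSomething` is not shown in this post.

```csharp
using System;

public static class ExceptionFilterSketch
{
    // Hypothetical stand-in for CallSomething(): always throws for the named parameter.
    private static void CallSomething(string badParameter)
    {
        throw new ArgumentException("Bad argument", badParameter);
    }

    // Returns a label identifying which catch block handled the exception.
    public static string Handle(string badParameter)
    {
        Func<ArgumentException, string, bool> filterParameterName = (e, s) => e.ParamName == s;
        try
        {
            CallSomething(badParameter);
            return "No exception";
        }
        catch (ArgumentException ex) when (ex.ParamName == "param1")
        {
            return "Filtered: param1";
        }
        catch (ArgumentException ex) when (filterParameterName(ex, "param2"))
        {
            return "Filtered: param2";
        }
        catch (ArgumentException ex)
        {
            // Reached only when both filters above returned false.
            return $"Unfiltered: {ex.ParamName}";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Handle("param1")); // Filtered: param1
        Console.WriteLine(Handle("param2")); // Filtered: param2
        Console.WriteLine(Handle("param3")); // Unfiltered: param3
    }
}
```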

In conclusion

Exception filtering is a useful and simple concept that should help to make exception handling easier to write. While some kind of filtering could be achieved before using conditions and throw inside of catch blocks, this language support now means that exception handlers (the content of catch blocks) have a single responsibility and the catch statements themselves are entirely responsible for declaring what must be caught. It also means that the exception handling within the .NET framework can be entirely responsible for routing exceptions in C#-implemented applications.
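
For contrast, this is roughly what the condition-and-throw approach looked like before C#6. It is only a sketch: `PreFilterSketch`, `Handle`, and the always-throwing `CallSomething` are hypothetical names, and the outer try/catch exists purely so the rethrown exception has somewhere to land.

```csharp
using System;

public static class PreFilterSketch
{
    // Hypothetical stand-in for CallSomething(): always throws for the named parameter.
    private static void CallSomething(string badParameter)
    {
        throw new ArgumentException("Bad argument", badParameter);
    }

    public static string Handle(string badParameter)
    {
        try
        {
            try
            {
                CallSomething(badParameter);
                return "No exception";
            }
            catch (ArgumentException ex)
            {
                // The handler has to decide whether it wants the exception at all...
                if (ex.ParamName == "param1")
                {
                    return "Handled: param1";
                }

                // ...and rethrow when it does not, preserving the original stack trace.
                throw;
            }
        }
        catch (ArgumentException ex)
        {
            return $"Rethrown: {ex.ParamName}";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Handle("param1")); // Handled: param1
        Console.WriteLine(Handle("param2")); // Rethrown: param2
    }
}
```

Here the catch block is doing two jobs, deciding and handling, which is exactly what the `when` clause separates.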

Exception filters have been supported by VB.NET and .NET-supporting variants of C++ since the versions released alongside .NET Framework 1.1; now, as of C#6, they are supported by C# too.

¹ 1.0, 1.2, 2.0, 3.0, 4.0, 5.0
² Actually, it's been keeping several, but we can't have everything
³ The first rethrows the original exception with its stack trace unchanged; the second rethrows the same exception object but resets its stack trace
⁴ Otherwise, why filter?

C#6: String Interpolation

Continuing the trend of my recent posts looking at the new features of C#6, this week I want to look at string interpolation.

Prior to C#6, string interpolation (or string formatting) was primarily the domain of the .NET framework and calls like `string.Format()` and `StringBuilder.AppendFormat()`¹ as in the following example:

var aString = "AString";
var formatString = string.Format("The string, '{0}', has {1} characters.", aString, aString.Length);

With string interpolation in C#6, this can be written as:

var aString = "AString";
var formatString = $"The string, '{aString}', has {aString.Length} characters.";

This is a little easier to read while also reducing what has to be typed to achieve the desired result. The contents of the braces are evaluated, converted to strings, and inserted into the resultant string. Under the hood, this example compiles down to the same `string.Format` call that was made in the earlier example. The same composite formatting power is there to specify things like significant figures and leading zeroes. If you need a culture-invariant string, there is a handy static method in the new `System.FormattableString` class called `Invariant()`. If you wrap your string with this `Invariant()` method, you will get the string formatted against the invariant culture.
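
As a rough illustration of format specifiers combined with `Invariant()`, consider the sketch below. The `InterpolationFormatSketch` class and its `Describe` method are made-up names, and the `D5` and `N2` specifiers are just two of the standard numeric format strings.

```csharp
using System;

public static class InterpolationFormatSketch
{
    // Hypothetical helper showing format specifiers inside an interpolated string.
    public static string Describe(int id, double amount)
    {
        // D5 pads with leading zeroes; N2 groups thousands and keeps two decimal places.
        // Invariant() formats against the invariant culture, so the result does not
        // depend on the machine's regional settings.
        return FormattableString.Invariant($"Order {id:D5} totals {amount:N2}");
    }

    public static void Main()
    {
        Console.WriteLine(Describe(42, 1234.5)); // Order 00042 totals 1,234.50
    }
}
```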

Magic

Of course, to end the story there without discussing the compiler magic would do a disservice to this new feature. In the example above, the result of the interpolation was stored in a variable declared with `var`. This means the type is inferred by the compiler, which infers `string` and then performs appropriate compiler operations to turn our interpolated string into a call to `string.Format()`. This means that we don't have to do anything else to use this feature and get formatted strings. However, we can make the compiler do something different by rewriting the line like this²:

FormattableString formatString = $"The string, '{aString}', has {aString.Length} characters.";

We have now specified that we are using a variable of type `FormattableString`. With this declaration, the compiler changes its behavior and we get a `FormattableString` object that represents the interpolated string. From this object, we can get the `Format` string that could be passed to a call that takes a format string, such as `string.Format()` (there are several others in types like `Console`, `StringBuilder`, and `TextWriter`). We can also retrieve the number of arguments³ in the string using `ArgumentCount`, and use `GetArgument()` and `GetArguments()` to retrieve the values of those arguments. Using a combination of `Format` and `GetArguments()`, we can pass this information to a different call that might reuse or extend it to produce a different message. Finally, we can use the `ToString()` overload that takes an `IFormatProvider`, allowing us to format the string according to a specific culture.

By telling the compiler that we want a `FormattableString` we get all this extra information to use as we see fit. If you look at the arguments using either of the `Get..` methods, you will see that the values have already been evaluated, so you can be assured that they won't change as you process the string. I'm sure there are situations where you might find this additional access to the formatting invaluable, such as when creating compound error messages, or perhaps doing some automatic language translation.
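
Here is a minimal sketch of poking at a `FormattableString`. The `FormattableStringSketch` class and its `Build` method are hypothetical; the members used (`Format`, `ArgumentCount`, `GetArgument()`, `GetArguments()`, and `ToString(IFormatProvider)`) are the ones described above.

```csharp
using System;
using System.Globalization;

public static class FormattableStringSketch
{
    // Hypothetical helper: the return type makes the compiler produce a FormattableString.
    public static FormattableString Build(string aString)
    {
        return $"The string, '{aString}', has {aString.Length} characters.";
    }

    public static void Main()
    {
        FormattableString message = Build("AString");

        Console.WriteLine(message.Format);         // The string, '{0}', has {1} characters.
        Console.WriteLine(message.ArgumentCount);  // 2
        Console.WriteLine(message.GetArgument(0)); // AString
        Console.WriteLine(message.GetArgument(1)); // 7

        // Format against a specific culture (the invariant one here).
        Console.WriteLine(message.ToString(CultureInfo.InvariantCulture));

        // Reuse the captured format and arguments to build a longer message.
        Console.WriteLine(string.Format("Note: " + message.Format, message.GetArguments()));
    }
}
```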

In conclusion…

There's not much else for me to say about C#6's string interpolation except to highlight one gotcha that I have hit a couple of times. The next two examples should illustrate appropriately:

Console.WriteLine($"{DateTime.Now}: I'm writing DateTime.Now to the console");

Console.WriteLine("{DateTime.Now}: I'm writing DateTime.Now to the console");

Here is what these two examples will output:

9/8/2015 11:11:43 AM: I'm writing DateTime.Now to the console

{DateTime.Now}: I'm writing DateTime.Now to the console

It's hard to argue with either of them; after all, they both wrote an interpretation of `DateTime.Now` to the console, but the first one is perhaps a more useful output⁴.

So why did the second example not work? You may have already spotted the answer to that question, especially if you're a VB programmer; it's the `$` at the start of the first example's string. This `$` tells the compiler that we are providing a string for interpolation. It's an easy thing to miss and if you forget it (or perhaps, in rare cases, add it erroneously) you'll likely only spot the mistake through thorough testing⁵ or customer diligence⁶. As always, learn the failure points and work to mitigate them with code reviews and tests. I suspect the easiest mitigation may be to always use the interpolation style strings unless a situation demands otherwise.

And that's it for this week. What do you think of the new string interpolation support? Will you start using it? If not, why not? Do you have any cool ideas for leveraging the additional information provided by `FormattableString`? Please share in the comments.

If you're interested in my other posts on some of the new things introduced by C#6, here are links to posts I have written thus far:

¹ The `+` operator can be used in conjunction with `ToString()` but it can get messy to read and is very hard to localize
² We could also cast the interpolated string to `FormattableString` and leave the variable as `var`.
³ Each inserted value is an argument
⁴ Except when providing examples in a blog
⁵ Unit tests or otherwise
⁶ Write automated tests and test manually; let's not use customers as QA

C#6: Auto-property Initializers and Expression-Bodied Properties

Last week, I discussed the new null-conditional operators added in C#6. This week, I would like to discuss two features that are awesome but could lead to some confusion: auto-property initializers and expression-bodied properties¹.

Auto-initialized Properties

Before C#6, if we wanted to properly define an immutable property that had some expensive initialization, we had to do the following²:

public class MyClass
{
    public MyClass()
    {
        _immutableBackingField = System.Environment.CurrentDirectory;
    }

    public string ImmutableProperty
    {
        get
        {
            return _immutableBackingField;
        }
    }

    private readonly string _immutableBackingField;
}

Some people often use the shortcut of an auto-implemented property using the following syntax:

public class MyClass
{
    public MyClass()
    {
        ImmutableProperty = System.Environment.CurrentDirectory;
    }

    public string ImmutableProperty
    {
        get;
        private set;
    }
}

However, defining properties like this means they are still mutable within the class (or, with a `protected` setter, within its derivations too). Using a backing field with the `readonly` keyword not only ensures that the property cannot be changed anywhere outside of the class construction, it also expresses exactly what you intended. Being as clear as possible is helpful for anyone who has to maintain the code in the future, including your future self.

From what I have read and heard, the main driver for using auto-implemented properties was writing less code. It somewhat saddens me when clarity of intent is replaced by speed of coding as we often pay for it later. Thankfully, both can now be achieved using initializers. Using this new feature, we can condense all that code down to just this:

class MyClass
{
    public string ImmutableProperty { get; } = System.Environment.CurrentDirectory;
}

It is a thing of beauty³. Behind the scenes, the compiler produces equivalent code to the first example with the `readonly` backing field.

Of course, this doesn't help much if you need to base your initialization on a value that is passed in via the constructor. Though a proposed feature for C#6, primary constructors, would have helped with this, it was pulled from the final release. Therefore, if you want to use constructor parameters, you still need a constructor that does the assignment, whether to a getter-only auto-property (which C#6 allows to be assigned during construction) or to a backing field of some kind. However, there is another feature that can help with this: expression-bodied properties⁴.

Expression-bodied Properties

An expression-bodied property looks like this:

class MyClass
{
    public int ImmutableProperty => 42;
}

This is equivalent to:

public class MyClass
{
    public int ImmutableProperty
    {
        get
        {
            return 42;
        }
    }
}

Using this lambda-esque syntax, we can provide more succinct implementations of our read-only properties. Consider this code:

public class MyClass
{
    public MyClass(string value)
    {
        _immutableBackingField = value;
    }

    public string ImmutableProperty
    {
        get
        {
            return _immutableBackingField;
        }
    }

    private readonly string _immutableBackingField;
}

Using expression-body syntax, we can write it as:

public class MyClass
{
    public MyClass(string value)
    {
        _immutableBackingField = value;
    }

    public string ImmutableProperty => _immutableBackingField;

    private readonly string _immutableBackingField;
}

But for the additional backing field declaration, this is almost as succinct as using an auto-implemented property. Hopefully, this new syntax will encourage people to make their intent clear rather than using the auto-implemented property shortcut when implementing immutable types.

Caveat Emptor

These new syntactical enhancements make property declaration not only easier to write, but in many common cases, easier to read. However, the similarities in these approaches can lead to some confusing, hard-to-spot bugs. Take this code as an example:

using System;

public class MyClass
{
    public string CurrentDirectory1 { get; } = Environment.CurrentDirectory;
    public string CurrentDirectory2 => Environment.CurrentDirectory;
}

Here we have two properties: `CurrentDirectory1` and `CurrentDirectory2`. Both seem to return the same thing, the current directory. However, a closer look reveals a subtle difference.

Imagine if the current directory is `C:\Stuff` at class instantiation but gets changed to `C:\Windows` some time afterward; `CurrentDirectory1` will return `C:\Stuff`, but `CurrentDirectory2` will return `C:\Windows`. The reason for this difference is the syntax used. The first property uses auto-initialization; it captures the value of `Environment.CurrentDirectory` on construction and always returns that captured value, even if `Environment.CurrentDirectory` changes. The second property uses an expression-body; it will always return the current value of `Environment.CurrentDirectory`, not the value of `Environment.CurrentDirectory` on construction of the `MyClass` instance.
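
The difference is easy to reproduce. In this sketch, a hypothetical `Settings.Current` static field stands in for `Environment.CurrentDirectory` so that the example does not depend on which directories exist on your machine; the `Captured` and `Live` property names are likewise made up.

```csharp
using System;

public static class Settings
{
    // Hypothetical mutable global standing in for Environment.CurrentDirectory.
    public static string Current = @"C:\Stuff";
}

public class MyClass
{
    public string Captured { get; } = Settings.Current; // evaluated once, at construction
    public string Live => Settings.Current;             // evaluated on every read
}

public static class PropertyDifferenceSketch
{
    public static void Main()
    {
        var instance = new MyClass();

        // Change the underlying value after the instance has been constructed.
        Settings.Current = @"C:\Windows";

        Console.WriteLine(instance.Captured); // C:\Stuff
        Console.WriteLine(instance.Live);     // C:\Windows
    }
}
```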

I am sure you can imagine more serious scenarios where such a mix-up could be a problem. Do you think this difference in behavior will be obvious enough during code review or when a bug is reported? I certainly don't and I'm writing this as a way of reinforcing it in my own mind. Perhaps you have already dealt with a bug relating to this; if so, share your tale of woe in the comments.

In Conclusion..

I am by no means intending to discourage the use of these two additions to the C# language; they are brilliant and you should definitely add them to your coding arsenal, but like many things in software development, there is a dark side. Understanding the pros and cons of any such feature is important as it enables us to spot errors, fix bugs, and write good tests. This new confusion in the C# world is just another encouragement to code clearly, test sensibly, and be aware of the power in the tools and languages we use.

¹ No one else seems to be hyphenating "expression-bodied" but it doesn't make sense to me otherwise; what is a "bodied property"?
² Yes, I know that `System.Environment.CurrentDirectory` isn't really expensive; this is for illustrative purposes
³ Especially if you are keen on making sure your code expresses exactly what you mean
⁴ Expression-bodied methods are also possible, but I'm not touching on that in this post