C#6: String Interpolation

Continuing the trend of my recent posts looking at the new features of C#6, this week I want to look at string interpolation.

Prior to C#6, string interpolation (or string formatting) was primarily the domain of the .NET framework and calls like `string.Format()` and `StringBuilder.AppendFormat()`1 as in the following example:

var aString = "AString";
var formatString = string.Format("The string, '{0}', has {1} characters.", aString, aString.Length);

With string interpolation in C#6, this can be written as:

var aString = "AString";
var formatString = $"The string, {aString}, has a {aString.Length} characters.";

This is a little easier to read while also reducing what has to be typed to achieve the desired result. The contents of the braces are evaluated as strings and inserted into the resultant string. Under the hood, this example compiles down to the same `string.Format` call that was made in the earlier example. The same composite formatting power is there to specify things like significant figures and leading zeroes. If you need a culture-invariant string, there is a handy static method in the new `System.FormattableString` class called `Invariant()`. If you wrap your string with this `Invariant()` method, you will get the string formatted against the invariant culture.

Magic

Of course, to end the story there without discussing the compiler magic would do a disservice to this new feature. In the example above, the result of the interpolation was stored in a variable with type `var`. This means the type is inferred by the compiler, which infers `string` and then performs appropriate compiler operations to turn our interpolated string into a call to `string.Format()`. This means that we don't have to do anything else to use this feature and get formatted strings. However, we can make the compiler do something different by rewriting the line like this2:

FormattableString formatString = $"The string, {aString}, has a {aString.Length} characters.";

We have now specified that we are using a variable of type `FormattableString`. With this declaration, the compiler changes its behavior and we get a `FormattedString` object that represents the interpolated string. From this object, we can get the `Format` string that could be passed to a call that takes a format string, such as `string.Format()` (there are several others in types like `Console`, `StringBuilder`, and `TextWriter`). We can also retrieve the number of arguments3 in the string using `ArgumentCount`, and use `GetArgument()` and `GetArguments()` to retrieve the values of those arguments. Using a combination of `Format` and `GetArguments()`, we can pass this information to a different call that might reuse or extend it to produce a different message. Finally, we can use the `ToString()` call to specify an `IFormatProvider`, allowing us to format the string according to a specific culture.

By telling the compiler that we want a `FormattableString` we get all this extra information to use as we see fit. If you look at the arguments using either of the `Get..` methods, you will see that the values have already been evaluated, so you can be assured that they won't change as you process the string. I'm sure there are situations where you might find this additional access to the formatting invaluable, such as when creating compound error messages, or perhaps doing some automatic language translation.

In conclusion…

There's not much else for me to say about C#6's string interpolation except to highlight one gotcha that I have hit a couple of times. The next two examples should illustrate appropriately:

Console.WriteLine($"{DateTime.Now}: I'm writing DateTime.Now to the console");
Console.WriteLine("{DateTime.Now}: I'm writing DateTime.Now to the console");

Here is what these two examples will output:

9/8/2015 11:11:43 AM: I'm writing DateTime.Now to the console.
{DateTime.Now}: I'm writing DateTime.Now to the console.

It's hard to argue with either of them, after all, they both wrote an interpretation of `DateTime.Now` to the console, but the first one is perhaps a more useful output4.

So why did the second example not work? You may have already spotted the answer to that question, especially if you're a VB programmer; it's the `$` at the start of the first example's string.  This `$` tells the compiler that we are providing a string for interpolation. It's an easy thing to miss and if you forget it (or perhaps, in rare cases, add it erroneously) you'll likely only spot the mistake through thorough testing5 or customer diligence6. As always, learn the failure points and work to mitigate them with code reviews and tests. I suspect the easiest mitigation may be to always use the interpolation style strings unless a situation demands otherwise.

And that's it for this week. What do you think of the new string interpolation support? Will you start using it? If not, why not? Do you have any cool ideas for leveraging the additional information provided by `FormattableString`? Please share in the comments.

If you're interested in my other posts on some of the new things introduced by C#6, here are links to posts I have written thus far:

  1. The `+` operator can be used in conjunction with `ToString()` but it can get messy to read and is very hard to localize []
  2. We could also cast the interpolated string to `FormattableString` and leave the variable as `var`. []
  3. Each inserted value is an argument []
  4. Except when providing examples in a blog []
  5. Unit tests or otherwise []
  6. Write automated tests and test manually; let's not use customers as QA []

C#6: Using Static Types and Extension Methods

This week I thought I would continue from the last couple of posts on the new language features introduced in C#6 and look at the changes to the `using` keyword.

Up until the latest syntax, `using` was overloaded in three different ways1.

  1. To import types from a specific namespace, reducing the need to fully quality those types when referencing them in subsequent code.
    using System.Collections;
  2. To alias namespaces and types in order to resolve ambiguities when types share a name but different namespaces.
    using Drawing = System.Windows.Drawing; // Namespace alias
    using RectangleShape = System.Windows.Shapes.Rectangle; // Type alias
    
  3. To define a scope at the end of which an object will be disposed
    using (var stream = new MemoryStream())
    {
        // Stuff using the stream
    }

With C#6 comes an additional overload that allows us to import methods from within a specific static class. By specifying the `static` keyword after `using`, we can give the name of a static class containing the members we want to import. Doing this allows us to reference the methods as though they were members of our class.

Using Static

Consider `System.Math`; prior to this updated syntax, using the various methods on the `System.Math` class would require either specifying the fully qualifed type name, `System.Math` or, if `using System` were specified, just the type name, `Math`. Now, by specifying `using static System.Math` we can reference the methods of the `Math` class as though they were members of the class invoking them (without a `System.Math` or `Math` prefix). In this example, `Math.Abs()` is called as just `Abs()`.

using static System.Math;

public class MyClass
{
    public void DoStuff(int value)
    {
        var absoluteValue = Abs(value);
        Console.WriteLine(absoluteValue);
    }
}

As with other additions in C#6, this seems to be aimed at improving developer productivity as it leads to less overall typing when using the methods of a static class. However, the new `using static` syntax also allows for very targeted inclusion of static classes without the rest of their containing namespace, previously only possible with an alias, such as `using Math = System.Math`. This targeting ability, while not really adding anything for regular static methods, makes a significant difference for extension methods.

Extension Methods

As you probably know, extension methods are just fancy static methods, they can even be invoked as would a regular static method. However, extension methods can also be invoekd as though they where member methods of a variable or literal value. For example, both the following examples compile to the same code (assuming we have an enumerable called `list`).

list.Where(x <= 5);
Enumerable.Where(list, x <= 5);

However, before the `using static` syntax, including extension methods was a bit uncontrolled. If you wanted the extension methods in `System.Linq.Enumerable`, you had to include the entire `System.Linq` namespace; there was no way to include only the `Enumerable` static class. In some circumstances, this inability to include just the static class led to annoying type name clashes and occasionally unexpected overload resolution ambiguities or surprises. Now, with `using static` we can specify the exact class of extension methods we want to include and ignore the rest of the containing namespace.

With all that said, there is a notable difference between including regular methods of a static class and extension methods of a static class when importing via `using static <namespace>.<static class>`2.

Subtle Difference

When a static class is imported with `using static`, the way a method can be invoked depends on whether it is an extension method or not. For example, imagine we have a static class called `MyStaticClass` and it has a regular3 static method on it called `Print` that takes a `string`. When included via `using static`, `Print` could be used like this:

void Main()
{
    MyStaticClass.Print("this string");
    // or...
    Print("this string");
}

However, if instead `Print` were an extension method on type `string`, including `MyStaticClass` via `using static` would limit `Print` to being used like this:

void Main()
{
    MyStaticClass.Print("this string");
    // or...
    "this string".Print();
}

Note how in both examples , `Print` can be invoked as a traditional static method when the containing type is referenced, as in `MyStaticClass.Print()`, but their invocation varies when `using static` imports the class. In that second scenario, non-extension static methods are invoked as though they are methods on the current type, where as extension methods are invoked only as though they are methods on a variable. For the extension method version of `Print`, the following is not allowed:

Print("this string");

To use this argument-style syntax with an extension method, we must resort to the same syntax we would have used before `using static`, specifying the type name before the method:

MyStaticClass.Print("this string");

Though I feel it is clear and intuitive, this is a subtle difference worth understanding, as it can lead to breaking changes. Consider if you were refactoring the methods of a static class from extension method to regular static method or vice versa, and that class were imported somewhere with `using static`; any invocations that were not prefixed with the static class name would fail to compile.

In Conclusion

Overall, I like the new `using static` syntax; I believe the differences in method invocation from how static class methods are normally invoked makes sense and I hope you do too. Like all the other features of C#, there will be times to use this feature and times to let it go in favour of something clearer and more appropriate. For me, the ability to pluck a specific class and its extension methods from a namespace without importing the rest of that namespace is the most useful aspect of `using static` and probably what I will use most. How about you? Do you see yourself adding `using static` to your coding arsenal, or is it going to languish in your scrapbook of coding evil? Do tell.

  1. The MSDN docs say two different ways, but it was clearly three and is now three and a halfish []
  2. However, unlike the subtle difference I highlighted in my last post, thankfully, the compiler will catch this one []
  3. as in not an extension method []

C#6: Auto-property Initializers and Expression-Bodied Properties

Last week, I discussed the new null-conditional operators added in C#6. This week, I would like to discuss two features that are awesome but could lead to some confusion: auto-property initializers and expression-bodied properties1.

Auto-initialized Properties

Before C#6, if we wanted to properly define an immutable property that had some expensive initialization, we had to do the following2:

public class MyClass
{
    public MyClass()
    {
        _immutableBackingField = System.Environment.CurrentDirectory;
    }

    public string ImmutableProperty
    {
        get
        {
            return _immutableBackingField;
        }
    }

    private readonly string _immutableBackingField;
}

Some people often use the shortcut of an auto-implemented property using the following syntax:

public class MyClass
{
    public MyClass()
    {
        ImmutableProperty = System.Environment.CurrentDirectory;
    }

    public string ImmutableProperty
    {
        get;
    }
}

However, defining properties like this means they are still mutable within the class (and its derivations). Using a backing field with the `readonly` keyword not only ensures that the property cannot be changed anywhere outside of the class construction, it also expresses exactly what you intended. Being as clear as possible is helpful for anyone who has to maintain the code in the future, including your future self.

From what I have read and heard, the main driver for using auto-implemented properties was writing less code. It somewhat saddens me when clarity of intent is replaced by speed of coding as we often pay for it later. Thankfully, both can now be achieved using initializers. Using this new feature, we can condense all that code down to just this:

class MyClass
{
    public int ImmutableProperty { get; } = System.Environment.CurrentDirectory;
}

It is a thing of beauty3. Behind the scenes, the compiler produces equivalent code to the first example with the `readonly` backing field.

Of course, this doesn't help much if you need to base your initialization on a value that is passed in via the constructor. Though a proposed feature for C#6, primary constructors, would have helped with this, it was pulled from the final release. Therefore, if you want to use construction parameters, you will still need a backing field of some kind. However, there is another feature that can help with this. That feature is expression-bodied properties4.

Expression-bodied Properties

An expression-bodied property looks like this:

class MyClass
{
    public int ImmutableProperty => 42;
}

This is equivalent to:

public class MyClass
{
    public int ImmutableProperty
    {
        get
        {
            return 42;
        }
    }
}

Using this lambda-esque syntax, we can provide more succinct implementations of our read-only properties. Consider this code:

public class MyClass
{
    public MyClass(string value)
    {
        _immutableBackingField = value;
    }

    public string ImmutableProperty
    {
        get
        {
            return _immutableBackingField;
        }
    }

    private readonly string _immutableBackingField;
}

Using expression-body syntax, we can write it as:

public class MyClass
{
    public MyClass(string value)
    {
        _immutableBackingField = value;
    }

    public string ImmutableProperty => _immutableBackingField;

    private readonly string _immutableBackingField;
}

But for the additional backing field declaration, this is almost as succinct as using an auto-implemented property. Hopefully, this new syntax will encourage people to make their intent clear rather than using the auto-implemented property shortcut when implementing immutable types.

Caveat Emptor

These new syntactical enhancements make property declaration not only easier to write, but in many common cases, easier to read. However, the similarities in these approaches can lead to some confusing, hard-to-spot bugs. Take this code as an example:

using System;

public class MyClass
{
    public string CurrentDirectory1 { get; } = Environment.CurrentDirectory;
    public string CurrentDirectory2 => Environment.CurrentDirectory;
}

Here we have two properties: `CurrentDirectory1` and `CurrentDirectory2`. Both seem to return the same thing, the current directory. However, a closer look reveals a subtle difference.

Imagine if the current directory is `C:\Stuff` at class instantiation but gets changed to `C:\Windows` some time afterward; `CurrentDirectory1` will return `C:\Stuff`, but `CurrentDirectory2` will return `C:\Windows`. The reason for this difference is the syntax used. The first property uses auto-initialization; it captures the value of `Environment.CurrentDirectory` on construction and always returns that captured value, even if `Environment.CurrentDirectory` changes. The second property uses an expression-body; it will always return the current value of `Environment.CurrentDirectory`, not the value of `Environment.CurrentDirectory` on construction of the `MyClass` instance.

I am sure you can imagine more serious scenarios where such a mix-up could be a problem. Do you think this difference in behavior will be obvious enough during code review or when a bug is reported? I certainly don't and I'm writing this as a way of reinforcing it in my own mind. Perhaps you have already dealt with a bug relating to this; if so, share your tale of woe in the comments.

In Conclusion..

I am by no means intending to discourage the use of these two additions to the C# language; they are brilliant and you should definitely add them to your coding arsenal, but like many things in software development, there is a dark side. Understanding the pros and cons of any such feature is important as it enables us to spot errors, fix bugs, and write good tests. This new confusion in the C# world is just another encouragement to code clearly, test sensibly, and be aware of the power in the tools and languages we use.

  1. No one else seems to by hyphenating "expression-bodied" but it doesn't make sense to me otherwise; what is a "bodied property"? []
  2. Yes, I know that `System.Enviroment.CurrentDirectory` isn't really expensive; this is for illustrative purposes []
  3. especially if you are keen on making sure your code expresses exactly what you mean []
  4. expression-bodied methods are also possible, but I'm not touching on that in this post []