C#6: Collection Initializers

Patterns and Collection Initializers

Some of the cool parts of C# are pattern-based, rather than type-based as one might expect. For example, foreach does not need the enumerated type to implement IEnumerable in order to work, it just requires that it has a GetEnumerator() method. Another place where pattern-based compilation occurs that also happens to illustrate how useful this pattern-based approach can be is in collection initializers like this:

When this gets compiled, for each value in the initializer the C# compiler1 looks for an Add() method on the collection type with an appropriate number of arguments of the appropriate types, which it then calls for that value. The benefit to using a pattern-based approach is that the compiler does not need to know about every possible compatible type up front or what Add() methods it might support. It only enforces that the type derives from IEnumerable and that it has an Add() method that matches the initializer values. This allows us to create a collection types that can support a variety of different ways to add values without needing the compiler to know our type will ever exist. For example, we could create a collection of names with Add() methods that take one or two strings and then initialize elements with either just the surname or first name and surname2.

Collection initializers in C#6

In C#6, a new collection initializer syntax has been added and the way the compiler interprets the existing syntax has been modified. Before we look at the newly added syntax, let us look at how the compilation of the existing syntax has been changed. To do so, consider a collection of DateTimeOffset values where we want to simplify adding dates and times from parsable string values. To support this we could implement an entire new type with the appropriate calls or we could derive from an existing collection type List<DateTimeOffset> and then implement a new Add() method to support string.

Of course, not all collections are open for extension and creating new types for this is cumbersome since we want a list of DateTimeOffset we just happen to want to initialize it from another type. To get around sealed types and the need to implement wrapper types or derivations, VB.NET has supported using extension methods to expand the Add() options on a type. I like this idea since, in the previous example, our list is really still of DateTimeOffset and we want others to see it that way, we just happen to support adding string values; why should we be forced to use a different type for that? Alas ((Cue Top Gear voice style)), this feature was not included in C#…until now. As of C#6, this disparity between VB.NET and C# is no more; the compiler will use a matching Add() extension method in lieu of an appropriate Add() method on the type itself.

Interestingly, this change to how C# resolves overloaded methods is very specific in that it only supports Add() extension methods and not extension methods in other pattern-based scenarios like GetEnumerator. I am not certain why this so, since I can imagine some cases where enumerating an existing non-enumerated type might be quite nice3, though I expect is is because it would not be clear what was going to get enumerated and therefore, the code would be ambiguous and hard to follow4. The Add() method usage in an initializer does not have this ambiguity as the compiler makes it clear if it found a suitable Add method that matches both the collection type and the type of the element being added.

Index Initializers

The other change to collection initializers in C#6 is the introduction of index intializer syntax. This new syntax is similar to the existing collection initializer syntax we have discussed, except that instead of using Add() methods, it uses indexers. With index-based collection initialization we can specify values for specific indices in a collection. This works for any indexer that a collection implements. Traditionally, we might initialize a Dictionary<string,string> using the Add() method pattern like this:

But with the index initializer syntax, we can make it clear that one string indexes the other to make this much more readable as:

I cannot speak for anyone else, but I think this really makes the code easier to read. Note, however, that this new index syntax cannot be mixed with traditional initializer syntax; for example, the following is invalid:

I think it is okay that they cannot be mixed. One way is using Add() method overload resolution to set values and the other is using indexers; these use different semantics and often have different implementations and connotations. By mixing them, the code becomes muddled and loses meaning; are we specifying records in a collection or are we mapping specific indexes to their records?

In Conclusion

Both of these changes to collection initialization are reasonably subtle. Of all the features C#6 brings us, these are perhaps going to be used the least. In fact, when I started writing this post I was unsure of their value. However, as I wrote and thought of usage examples, I came to the realisation that although they cater to perhaps infrequent scenarios, these changes to collection initializers each provide nice additions to the C# language. Index initializers remove a little ambiguity from the initialization of indexed collections, such as dictionaries, whereas the expansion of Add() method overload resolution to include extension methods reduces the number of frivolous types we have to create. In short, they allow us to write simpler, clearer code, and that is a beautiful thing.


  1. pre-C#6 

  2. A contrived example to be sure, but illustrative none-the-less 

  3. Such as enumerating the lines from a file stream 

  4. Much clearer to write a LineEnumerator wrapper for FileStream and use it explicitly 

C#6: String Interpolation

Continuing the trend of my recent posts looking at the new features of C#6, this week I want to look at string interpolation.

Prior to C#6, string interpolation (or string formatting) was primarily the domain of the .NET framework and calls like string.Format() and StringBuilder.AppendFormat()1 as in the following example:

With string interpolation in C#6, this can be written as:

This is a little easier to read while also reducing what has to be typed to achieve the desired result. The contents of the braces are evaluated as strings and inserted into the resultant string. Under the hood, this example compiles down to the same string.Format call that was made in the earlier example. The same composite formatting power is there to specify things like significant figures and leading zeroes. If you need a culture-invariant string, there is a handy static method in the new System.FormattableString class called Invariant(). If you wrap your string with this Invariant() method, you will get the string formatted against the invariant culture.

Magic

Of course, to end the story there without discussing the compiler magic would do a disservice to this new feature. In the example above, the result of the interpolation was stored in a variable with type var. This means the type is inferred by the compiler, which infers string and then performs appropriate compiler operations to turn our interpolated string into a call to string.Format(). This means that we don't have to do anything else to use this feature and get formatted strings. However, we can make the compiler do something different by rewriting the line like this2:

We have now specified that we are using a variable of type FormattableString. With this declaration, the compiler changes its behavior and we get a FormattedString object that represents the interpolated string. From this object, we can get the Format string that could be passed to a call that takes a format string, such as string.Format() (there are several others in types like Console, StringBuilder, and TextWriter). We can also retrieve the number of arguments3 in the string using ArgumentCount, and use GetArgument() and GetArguments() to retrieve the values of those arguments. Using a combination of Format and GetArguments(), we can pass this information to a different call that might reuse or extend it to produce a different message. Finally, we can use the ToString() call to specify an IFormatProvider, allowing us to format the string according to a specific culture.

By telling the compiler that we want a FormattableString we get all this extra information to use as we see fit. If you look at the arguments using either of the Get.. methods, you will see that the values have already been evaluated, so you can be assured that they won't change as you process the string. I'm sure there are situations where you might find this additional access to the formatting invaluable, such as when creating compound error messages, or perhaps doing some automatic language translation.

In conclusion…

There's not much else for me to say about C#6's string interpolation except to highlight one gotcha that I have hit a couple of times. The next two examples should illustrate appropriately:

Here is what these two examples will output:

It's hard to argue with either of them, after all, they both wrote an interpretation of DateTime.Now to the console, but the first one is perhaps a more useful output4.

So why did the second example not work? You may have already spotted the answer to that question, especially if you're a VB programmer; it's the $ at the start of the first example's string.  This $ tells the compiler that we are providing a string for interpolation. It's an easy thing to miss and if you forget it (or perhaps, in rare cases, add it erroneously) you'll likely only spot the mistake through thorough testing5 or customer diligence6. As always, learn the failure points and work to mitigate them with code reviews and tests. I suspect the easiest mitigation may be to always use the interpolation style strings unless a situation demands otherwise.

And that's it for this week. What do you think of the new string interpolation support? Will you start using it? If not, why not? Do you have any cool ideas for leveraging the additional information provided by FormattableString? Please share in the comments.

If you're interested in my other posts on some of the new things introduced by C#6, here are links to posts I have written thus far:


  1. The + operator can be used in conjunction with ToString() but it can get messy to read and is very hard to localize 

  2. We could also cast the interpolated string to FormattableString and leave the variable as var

  3. Each inserted value is an argument 

  4. Except when providing examples in a blog 

  5. Unit tests or otherwise 

  6. Write automated tests and test manually; let's not use customers as QA