C#6: Collection Initializers

Patterns and Collection Initializers

Some of the cool parts of C# are pattern-based, rather than type-based as one might expect. For example, `foreach` does not need the enumerated type to implement `IEnumerable` in order to work, it just requires that it has a `GetEnumerator()` method. Another place where pattern-based compilation occurs that also happens to illustrate how useful this pattern-based approach can be is in collection initializers like this:

var list = new List<int> { 1, 2, 3, 4, 5, 6 };

When this gets compiled, for each value in the initializer the C# compiler1 looks for an `Add()` method on the collection type with an appropriate number of arguments of the appropriate types, which it then calls for that value. The benefit to using a pattern-based approach is that the compiler does not need to know about every possible compatible type up front or what `Add()` methods it might support. It only enforces that the type derives from `IEnumerable` and that it has an `Add()` method that matches the initializer values. This allows us to create a collection types that can support a variety of different ways to add values without needing the compiler to know our type will ever exist. For example, we could create a collection of names with `Add()` methods that take one or two strings and then initialize elements with either just the surname or first name and surname2.

var names = new NamesCollection
{
   "Jones",
   { "David", "Smith" },
   { "Daniel", "Smith" },
   "Smith"
};

Collection initializers in C#6

In C#6, a new collection initializer syntax has been added and the way the compiler interprets the existing syntax has been modified. Before we look at the newly added syntax, let us look at how the compilation of the existing syntax has been changed. To do so, consider a collection of `DateTimeOffset` values where we want to simplify adding dates and times from parsable string values. To support this we could implement an entire new type with the appropriate calls or we could derive from an existing collection type `List<DateTimeOffset>` and then implement a new `Add()` method to support `string`.

public class DateTimeOffsetList : List<DateTimeOffset>
{
    public void Add(string value)
    {
        base.Add(DateTimeOffset.Parse(value));
    }
}

Of course, not all collections are open for extension and creating new types for this is cumbersome since we want a list of `DateTimeOffset` we just happen to want to initialize it from another type. To get around sealed types and the need to implement wrapper types or derivations, VB.NET has supported using extension methods to expand the `Add()` options on a type. I like this idea since, in the previous example, our list is really still of `DateTimeOffset` and we want others to see it that way, we just happen to support adding `string` values; why should we be forced to use a different type for that? Alas ((Cue Top Gear voice style)), this feature was not included in C#…until now. As of C#6, this disparity between VB.NET and C# is no more; the compiler will use a matching `Add()` extension method in lieu of an appropriate `Add()` method on the type itself.

public static class Extensions
{
    public static void Add(this List<DateTimeOffset> list, string value)
    {
        list.Add(DateTimeOffset.Parse(value));
    }
}

Interestingly, this change to how C# resolves overloaded methods is very specific in that it only supports `Add()` extension methods and not extension methods in other pattern-based scenarios like `GetEnumerator`. I am not certain why this so, since I can imagine some cases where enumerating an existing non-enumerated type might be quite nice3, though I expect is is because it would not be clear what was going to get enumerated and therefore, the code would be ambiguous and hard to follow4. The `Add()` method usage in an initializer does not have this ambiguity as the compiler makes it clear if it found a suitable `Add` method that matches both the collection type and the type of the element being added.

Index Initializers

The other change to collection initializers in C#6 is the introduction of index intializer syntax. This new syntax is similar to the existing collection initializer syntax we have discussed, except that instead of using `Add()` methods, it uses indexers. With index-based collection initialization we can specify values for specific indices in a collection. This works for any indexer that a collection implements. Traditionally, we might initialize a `Dictionary<string,string>` using the `Add()` method pattern like this:

var pairs = new Dictionary<string, string>
{
    {"address","12 This Way, Anywhere"},
    {"name", "Bob Foo"},
    {"title", "Master of the universe"}
};

But with the index initializer syntax, we can make it clear that one string indexes the other to make this much more readable as:

var pairs = new Dictionary<string, string>
{
    ["address"] = "12 This Way, Anywhere",
    ["name"] = "Bob Foo",
    ["title"] = "Master of the universe"
};

I cannot speak for anyone else, but I think this really makes the code easier to read. Note, however, that this new index syntax cannot be mixed with traditional initializer syntax; for example, the following is invalid:

var pairs = new Dictionary<string, string>
{
    {"address","12 This Way, Anywhere"},
    ["name"] = "Bob Foo",
    ["title"] = "Master of the universe"
};

I think it is okay that they cannot be mixed. One way is using `Add()` method overload resolution to set values and the other is using indexers; these use different semantics and often have different implementations and connotations. By mixing them, the code becomes muddled and loses meaning; are we specifying records in a collection or are we mapping specific indexes to their records?

In Conclusion

Both of these changes to collection initialization are reasonably subtle. Of all the features C#6 brings us, these are perhaps going to be used the least. In fact, when I started writing this post I was unsure of their value. However, as I wrote and thought of usage examples, I came to the realisation that although they cater to perhaps infrequent scenarios, these changes to collection initializers each provide nice additions to the C# language. Index initializers remove a little ambiguity from the initialization of indexed collections, such as dictionaries, whereas the expansion of `Add()` method overload resolution to include extension methods reduces the number of frivolous types we have to create. In short, they allow us to write simpler, clearer code, and that is a beautiful thing.

  1. pre-C#6 []
  2. A contrived example to be sure, but illustrative none-the-less []
  3. Such as enumerating the lines from a file stream []
  4. Much clearer to write a `LineEnumerator` wrapper for `FileStream` and use it explicitly []