C#6: Using Static Types and Extension Methods

This week I thought I would continue from the last couple of posts on the new language features introduced in C#6 and look at the changes to the `using` keyword.

Up until the latest syntax, `using` was overloaded in three different ways1.

  1. To import types from a specific namespace, reducing the need to fully quality those types when referencing them in subsequent code.
    using System.Collections;
  2. To alias namespaces and types in order to resolve ambiguities when types share a name but different namespaces.
    using Drawing = System.Windows.Drawing; // Namespace alias
    using RectangleShape = System.Windows.Shapes.Rectangle; // Type alias
    
  3. To define a scope at the end of which an object will be disposed
    using (var stream = new MemoryStream())
    {
        // Stuff using the stream
    }

With C#6 comes an additional overload that allows us to import methods from within a specific static class. By specifying the `static` keyword after `using`, we can give the name of a static class containing the members we want to import. Doing this allows us to reference the methods as though they were members of our class.

Using Static

Consider `System.Math`; prior to this updated syntax, using the various methods on the `System.Math` class would require either specifying the fully qualifed type name, `System.Math` or, if `using System` were specified, just the type name, `Math`. Now, by specifying `using static System.Math` we can reference the methods of the `Math` class as though they were members of the class invoking them (without a `System.Math` or `Math` prefix). In this example, `Math.Abs()` is called as just `Abs()`.

using static System.Math;

public class MyClass
{
    public void DoStuff(int value)
    {
        var absoluteValue = Abs(value);
        Console.WriteLine(absoluteValue);
    }
}

As with other additions in C#6, this seems to be aimed at improving developer productivity as it leads to less overall typing when using the methods of a static class. However, the new `using static` syntax also allows for very targeted inclusion of static classes without the rest of their containing namespace, previously only possible with an alias, such as `using Math = System.Math`. This targeting ability, while not really adding anything for regular static methods, makes a significant difference for extension methods.

Extension Methods

As you probably know, extension methods are just fancy static methods, they can even be invoked as would a regular static method. However, extension methods can also be invoekd as though they where member methods of a variable or literal value. For example, both the following examples compile to the same code (assuming we have an enumerable called `list`).

list.Where(x <= 5);
Enumerable.Where(list, x <= 5);

However, before the `using static` syntax, including extension methods was a bit uncontrolled. If you wanted the extension methods in `System.Linq.Enumerable`, you had to include the entire `System.Linq` namespace; there was no way to include only the `Enumerable` static class. In some circumstances, this inability to include just the static class led to annoying type name clashes and occasionally unexpected overload resolution ambiguities or surprises. Now, with `using static` we can specify the exact class of extension methods we want to include and ignore the rest of the containing namespace.

With all that said, there is a notable difference between including regular methods of a static class and extension methods of a static class when importing via `using static <namespace>.<static class>`2.

Subtle Difference

When a static class is imported with `using static`, the way a method can be invoked depends on whether it is an extension method or not. For example, imagine we have a static class called `MyStaticClass` and it has a regular3 static method on it called `Print` that takes a `string`. When included via `using static`, `Print` could be used like this:

void Main()
{
    MyStaticClass.Print("this string");
    // or...
    Print("this string");
}

However, if instead `Print` were an extension method on type `string`, including `MyStaticClass` via `using static` would limit `Print` to being used like this:

void Main()
{
    MyStaticClass.Print("this string");
    // or...
    "this string".Print();
}

Note how in both examples , `Print` can be invoked as a traditional static method when the containing type is referenced, as in `MyStaticClass.Print()`, but their invocation varies when `using static` imports the class. In that second scenario, non-extension static methods are invoked as though they are methods on the current type, where as extension methods are invoked only as though they are methods on a variable. For the extension method version of `Print`, the following is not allowed:

Print("this string");

To use this argument-style syntax with an extension method, we must resort to the same syntax we would have used before `using static`, specifying the type name before the method:

MyStaticClass.Print("this string");

Though I feel it is clear and intuitive, this is a subtle difference worth understanding, as it can lead to breaking changes. Consider if you were refactoring the methods of a static class from extension method to regular static method or vice versa, and that class were imported somewhere with `using static`; any invocations that were not prefixed with the static class name would fail to compile.

In Conclusion

Overall, I like the new `using static` syntax; I believe the differences in method invocation from how static class methods are normally invoked makes sense and I hope you do too. Like all the other features of C#, there will be times to use this feature and times to let it go in favour of something clearer and more appropriate. For me, the ability to pluck a specific class and its extension methods from a namespace without importing the rest of that namespace is the most useful aspect of `using static` and probably what I will use most. How about you? Do you see yourself adding `using static` to your coding arsenal, or is it going to languish in your scrapbook of coding evil? Do tell.

  1. The MSDN docs say two different ways, but it was clearly three and is now three and a halfish []
  2. However, unlike the subtle difference I highlighted in my last post, thankfully, the compiler will catch this one []
  3. as in not an extension method []

LINQ: Deferred Execution

This is part of a short series on the basics of LINQ:

In the first rant post of this short series on LINQ, I explained the motivation behind writing this series in the first place, which can be summarised as:

People don't know LINQ and that impacts my ability to make use of it; I should try to fix that.

To start, I'm going to explain what I believe is the most important concept in LINQ; deferred execution.

So, what is deferred execution?

Deferred execution code is not executed until the result is needed; the execution is put off (deferred) until later. By doing this we can combine a series of actions without actually executing any of them, then execute them at the time we need a result. This allows us to limit the execution of computationally expensive operations until we absolutely need them.

That's my description, here is one from an MSDN tutorial on LINQ-to-XML that perhaps puts it more clearly:

Deferred execution means that the evaluation of an expression is delayed until its realized value is actually required. Deferred execution can greatly improve performance when you have to manipulate large data collections, especially in programs that contain a series of chained queries or manipulations. In the best case, deferred execution enables only a single iteration through the source collection.

Some may be surprised to know that deferred execution was not new when LINQ arrived, it had already been around for quite some time in the form of iterator methods. In fact, it is iterator methods that give LINQ its deferred execution. Before we look at an iterator method, let's look at an example of immediate execution. For this example, we will give ourselves the task of taking a collection of people and outputting a collection of unique last names for all those born before 1980.

public struct Person
{
    public Person(string first, string last, DateTimeOffset dateOfBirth) : this()
    {
        FirstName = first;
        LastName = last;
        DateOfBirth = dateOfBirth;
    }
	
    public string FirstName { get; private set; }
    public string LastName { get; private set; }
    public DateTimeOffset DateOfBirth { get; private set; }
}

public static class Data
{
    public static IEnumerable<Person> People = new[] {
        new Person("John", "Smith", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 34 )),
        new Person("Bill", "Smith", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 20 )),
        new Person("Sarah", "Allans", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 19 )),
        new Person("John", "Johnson", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 44 )),
        new Person("James", "Jones", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 78 )),
        new Person("Alex", "Jones", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 30 )),
        new Person("Mabel", "Thomas", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 29 )),
        new Person("Sarah", "Brown", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 23 )),
        new Person("Gareth", "Smythe", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 100 )),
        new Person("Gregory", "Drake", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 90 )),
        new Person("Michael", "Johnson", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 56 )),
        new Person("Alex", "Smith", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 22 )),
        new Person("William", "Pickwick", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 17 )),
        new Person("Marcy", "Dickens", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 18 )),
        new Person("Erica", "Waters", DateTimeOffset.Now - TimeSpan.FromDays( 365 * 26 ))
    };
}
IEnumerable<string> GetLastNamesImmediate()
{
    var distinctNamesToReturn = new List<string>();
    foreach (var person in Data.People)
    {
        if (person.DateOfBirth.Year < 1980 && !distinctNamesToReturn.Contains(person.LastName))
        {
            distinctNamesToReturn.Add(person.LastName);
        }
    }
    return distinctNamesToReturn;
}

When this method is called, it iterates over the entire collection and then returns its result, which also can then be iterated. If the collection of people were huge and we only cared about the first five names, this would be incredibly slow. To turn this into deferred execution, we can write it like this:

IEnumerable<string> GetLastNamesLazyDeferredWithoutLINQ()
{
    var lastNamesWeHaveSeen = new List<string>();
    foreach (var person in Data.People)
    {
        if (person.DateOfBirth.Year < 1980 && !lastNamesWeHaveSeen.Contains(person.LastName))
        {
            lastNamesWeHaveSeen.Add(person.LastName);
            yield return person.LastName;
        }
    }
}

Now, none of the code in this method is executed until the first time something calls `MoveNext()` on the returned enumerable. This means we could take the first five names without processing the entire collection of people, giving us a potentially enormous performance gain. If each item in the generated collection were computationally expensive to produce, without lazy evaluation, that expense would be multiplied by the total number of items in the collection on every call to the generator method; however, with lazy evaluation, the consumer of the collection gets to decide how many items are computed and therefore, how much work gets done. This ability to defer and control computationally expensive operations is the power of deferred execution.

However, not every deferred action necessarily has low overhead. Deferred execution actually comes in two flavours; eager evaluation and lazy evaluation (the example above is an example of lazy evaluation)1. Every action in a deferred execution chain uses either lazy or eager evaluation. Though lazy evaluation is preferred, sometimes it is not possible to evaluate one item at a time, such as when sorting. Eagerly evaluated deferred execution allows us to at least defer the effort until we want it done.

An eagerly evaluated version of the iterator method we have been looking at might look like this:

IEnumerable<string> GetLastNamesEagerDeferredWithoutLINQ()
{
    var distinctNamesToReturn = new List<string>();
    foreach (var person in Data.People)
    {
        if (person.DateOfBirth.Year < 1980 && !distinctNamesToReturn.Contains(person.LastName))
        {
            distinctNamesToReturn.Add(person.LastName);
        }
    }
	
    foreach (var name in distinctNamesToReturn)
    {
        yield return name;
    }
}

In this example, because it is still an iterator method (it returns its results using `yield return`), none of the code is executed until the very first time the `MoveNext()` method is called on the returned enumerable, and therefore, the execution is deferred. When `MoveNext()` gets called for the first time, the entire collection of data is processed at once and then the results are output one by one as needed. The difference between this and the immediate execution equivalent we first looked at is that in this version, no work is done until a result is demanded.

Allowing the consumer of a collection to control how much work is done rather than work being dictated by the collection generator allows us to manage data more efficiently by building chains of operations and then processing the result in one go when needed. Lazy evaluation gives us the additional ability to spread the effort across each call to `MoveNext()`. The key to writing good LINQ is understanding which actions are immediate, which are deferred and lazily evaluated, which are deferred and eagerly evaluated, and why it matters. We will take a look at that next time.

  1. Quite often, people use 'deferred execution' and 'lazy evaluation' interchangeably, but they are not actually synonymous, nor is 'immediate execution' synonymous with 'eager evaluation'. []

Unit testing attribute driven late-binding

I've been working on a RESTful API using ASP WebAPI. It has been a great experience so far. Behind the API is a custom framework that involves some late-binding. I decorate certain types with an attribute that associates the decorated type with another type1. The class orchestrating the late-binding takes a collection of IDecorated instances. It uses reflection to look at their attributes to determine the type they are decorated with and then instantiates that type.

It's not terribly complicated. At least it wasn't until I tried to test it. As part of my development I have been using TDD, so I wanted unit tests for my late-binding code, but I soon hit a hurdle. In mocking IDecorated, how do I make sure the mocked concrete type has the appropriate attribute?

var mockedObject = new Mock();

// TODO: Add attribute

binder.DoSpecialThing( mockedObject.Object ).Should().BeAwesome();

I am using Moq for my mocking framework accompanied by FluentAssertions for my asserts2. Up until this point, Moq seemed to have everything covered, yet try as I might I couldn't resolve this problem of decorating the generated type. After some searching around I eventually found a helpful Stack Overflow question and answer that directed me to TypeDescriptor.AddAttribute, a .NET-framework method that provides one with the means to add attributes at run-time!

var mockedObject = new Mock();

TypeDescriptor.AddAttribute(
    mockedObject.Object.GetType(),
    new MyDecoratorAttribute( typeof(SuperCoolThing) );

binder.DoSpecialThing( new [] { mockedObject.Object } )
    .Should()
    .BeAwesome();

Yes! Runtime modification of type decoration. Brilliant.

So, why didn't it work?

My binding class that I was testing looked a little like this:

public IEnumerable<Blah> DoSpecialThing( IEnumerable<IDecorated> decoratedThings )
{
    return from thing in decoratedThings
           let converter = GetBlahConverter( d.GetType() )
           where d != null
           select converter.Convert( d );
}

private IConverter GetBlahConverter( Type type )
{
    var blahConverterAttribute = Attribute
        .GetCustomAttributes( type, true )
        .Cast<BlahConverterAttribute>()
        .FirstOrDefault();

    if ( blahConverterAttribute != null )
    {
        return blahConverterAttribute.ConverterType;
    }

    return null;
}

Looks fine, right? Yet when I ran it in the debugger and took a look, the result of GetCustomAttributes was an empty array. I was stumped.

After more time trying different things that didn't work than I'd care to admit, I returned to the StackOverflow question and started reading the comments; why was the answer accepted answer when it clearly didn't work? Lurking in the comments was the missing detail; if you use TypeDescriptor.AddAttributes to modify the attributes then you have to use TypeDescriptor.GetAttributes to retrieve them.

I promptly refactored my code with this detail in mind.

public IEnumerable<Blah> DoSpecialThing( IEnumerable<IDecorated> decoratedThings )
{
    return from thing in decoratedThings
           let converter = GetBlahConverter( d.GetType() )
           where d != null
           select converter.Convert( d );
}

private IConverter GetBlahConverter( Type type )
{
    var blahConverterAttribute = TypeDescriptor
        .GetAttributes( type )
        .OfType<BlahConverterAttribute>()
        .FirstOrDefault();

    if ( blahConverterAttribute != null )
    {
        return blahConverterAttribute.ConverterType;
    }

    return null;
}

Voila! My test passed and my code worked. This was one of those things that had me stumped for longer than it should have. I am sharing it in the hopes of making sure there are more hits when someone else goes Internet fishing for help. Now, I'm off to update Stack Overflow so that this is clearer there too.

  1. Something similar to TypeConverterAttribute usage in the BCL []
  2. Though I totally made up the BeAwesome() assertion in this blog post []