Jeff – Page 7 – somewhat abstract

C#7: Pattern Matching

So far in this series on C#7, we have looked at some nice new things, including out variables, new expression-bodied members, throw expressions, and binary numeric literals. These are all great little additions, but this week, we get to a truly cool and long-awaited feature; pattern matching. But, before we get into that, here is the usual summary of what I am covering.

Pattern Matching

I will admit that I was a tad confused as to why this is called "pattern matching". I read the Wikipedia article¹. I expect I would understand this more if I were a Computer Science major instead of Computer Systems Engineering². According to Wikipedia:

In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern.

The What's New in C#7 documentation explains:

Pattern matching is a feature that allows you to implement method dispatch on properties other than the type of an object.

While I now understand the academic concept of pattern matching, I think we may regret using the term to specifically reference the features that are under its umbrella for C#7. I would have preferred it if these features had been introduced using already widely understood and consistent nomenclature. I think the official documentation agrees with me, since it almost immediately splits the feature into two pieces; is expressions and switch expressions. However, I am going to draw the line that divides the parts of pattern matching in a different place and call the two parts cast-conditional variables and case filters, because that fits better with my understanding of what they do.

Cast-conditional Variables

Many of us know the following code:

var x = y as MyType;
if (x != null)
{
   //Do the awesome!
}

This is the efficient way of verifying a value is a given type and then consuming it as that type since we cast just once³. However, with the new cast-conditional feature, we can condense this down to:

if (y is MyType x)
{
    //Do the awesome!
}

The cast and the check of its success have been condensed into a single, easily understood statement. We can use this cast-conditional variable syntax in any place where we might otherwise but a boolean expression:

// Example 1: Calculating a boolean
bool isDoubleNan = value is double y && Double.IsNaN(y);

// Example 2: Ternary operator
string filePath = size is int x && x > 10 ? "C:\Archive" : "C:\TinyFiles";

This feels like a great improvement to me and fits really well with the similar out variable feature, however it is only a new way of writing something we could already do. When extended to the cases in switch statements (albeit with a slightly different syntax), this gives us a brand new ability; switching based on variable type:

switch (myInterface)
{
case MyTypeA a:
    // Do something because we know this is of type, MyTypeA
    break;

case MyTypeB b:
    // Do something because we know this is of type, MyTypeB
    break;
}

Now you can iterate over that array of object values and have a nice, succinct way of processing the contents by type. Not only that, but with case filters, you can craft even finer conditions for your switches.

Case Filters

C#6 introduced the when keyword as a modifier to catch expressions so that we could finally utilize exception filters from C#. Now, the when keyword gets a similar job in switch statements as a modifier to case statements. Like in exception filters, the when expression is a condition that must be met for the case to be a match. For example, we could create an IsNumber method and use when to filter cases like Infinity and NaN:

bool IsNumber(object value)
{
    switch (value)
    {
    case int x:
    case float y when !float.IsNaN(y) && !float.IsInfinity(y):
    case double z when !double.IsNaN(z) && !double.IsInfinity(z):
        return true;

    default:
        return false;
    }
}

Prior to C#7, this code would look something like this:

bool IsNumberOld(object value)
{
    int? x = value as int?;
    if (x.HasValue) return true;
    
    float? y = value as float?;
    if (y.HasValue && !float.IsNaN(y.Value) && !float.IsInfinity(y.Value)) return true;

    double? z = value as double?;
    if (z.HasValue && !double.IsNaN(z.Value) && !double.IsInfinity(z.Value)) return true;

    return false;
}

Not only was there more typing before C#7, but I think the code was more repetitive and harder to scan. This may be a convoluted example, but I hope it illustrates how valuable this new language syntax can be.

In Conclusion

Naming disagreements aside, the new pattern matching in C#7 is powerful. With that power comes responsibility; the responsibility to use it wisely and to call out others who do not, for there is surely great room for abuse with this feature. I envisage frequent and appropriate use of cast-conditional variables in if statements since the scenarios to which that caters are widespread. However, the filtering added to switch statements brings something entirely new and so, I do not see it being as widely adopted not as appropriately used; time will tell.

Overall, I love this addition to C#7 even though I do not like the name. What do you think? Does "pattern matching" make sense or should it be something like "cast-conditional variables" and "case filters" instead? Will you use this feature a lot? When might you find pattern matching useful? Sound off in the comments and discuss.

that's a lie; I read parts of it until I came to the conclusion that it was not helping [↩]
Is this a theory versus practice issue again? I face those often in this field [↩]
Using the is condition and then casting inside the if statement would cause two casts; one for the is and another for the cast [↩]

C#7: Throw Expressions and More Expression-bodied Members

In this installment of my look at C#7, we will take a look at some nice syntactical enhancements, including the first ever community contribution to the C# language implementation. Before we get started, here is a summary of what I am covering in this series on C#7.

Throw Expressions

We have all written code like this¹:

public class MyApiType
{
    private object _loadedResource;
    private object _someProperty;

    public MyApiType()
    {
        _loadedResource = LoadResource();
        if (_loadedResource == null) throw new InvalidOperationException();
    }

    public object SomeProperty
    {
        get
        {
            return _someProperty;
        }
    
        set
        {
            if (value == null) throw new ArgumentNullException();
            _someProperty = value;
        }
    }
}

I have omitted the exception arguments for brevity, but you should hopefully recognise the sort of sanity checking to which I am referring within the highlighted lines.

With throw expressions, we can now combine assignment, the null-coalescing operator², and throw to create succinct validation code. This means that the example above can be simplified to not even need the constructor.

public class MyApiType
{
    private object _loadedResource = LoadResource() ?? throw new InvalidOperationException();
    private object _someProperty;

    public object SomeProperty
    {
        get
        {
            return _someProperty;
        }
    
        set
        {
            _someProperty = value ?? throw new ArgumentNullException();
        }
    }
}

The highlighted lines are equivalent to the code we had earlier, but now we are able to use throw as part of the expression. The introduction of throw expressions means that we can now throw exceptions in conditional and null-coalescing expressions, as well as some lambda methods where it was previously not possible to do so. Not only that, but when combined with expression-bodied members, we can write some very expressive yet terse code.

Expression-bodied Members

With C#6 we got expression-bodied members, which allowed us to express simple methods using lambda-like syntax. However, this new syntax was limited to methods and read-only properties. Via the first ever community contribution to C#³, C#7 expands this syntax to cover constructors, finalizers, and property accessors.

If we take the property example we had before, containing our throw expression as part of property set accessor, we can now write it as:

public object SomeProperty
{
    get => _someProperty;
    set => _someProperty = value ?? throw new ArgumentNullException();
}

I won't bother with examples for constructors or finalizers; the main documentation is pretty clear on those and I am not convinced the syntax will be used very often in those cases. Constructors are rarely so simple that the expression-bodied syntax makes sense, and finalizers are so rarely needed⁴ that most of us will not get an opportunity to write one at all, expression-bodied or otherwise.

In Conclusion

These simple additions to the C# syntax enable us to write terse code without losing clarity, which is always a good thing. Not only that, but we have reached a landmark event; community contributions to C#. This contribution may be a little tame when compared with some of the other features coming in C#7, but it bodes well for the future of the language in its new, open source home.

Next time, we will take a look at the highly anticipated pattern matching. Until then, feel free to leave a comment, or read more about C#7 on my blog and on the official documentation.

Let's ignore the nastiness of throwing exceptions during construction [↩]
You remember Elvis, right?? [↩]
Source: https://docs.microsoft.com/en-us/dotnet/articles/csharp/csharp-7#more-expression-bodied-members [↩]
If you find yourself writing a finalizer, I recommend you make sure you really need it; there is probably a better way [↩]

C#7: Out Variables

Last time, we started to look at the new features introduced in C#7. Here is a quick refresher of just what those features are:

In this post, we will look at one of the simplest additions to the C# language; out variables.

int dummy

How often have you written code like this?

int dummy;

if (int.TryParse(someString, out dummy) && dummy > 0)
{
   // Do something
}

Or this?

double dummy;

if (myDictionary.TryGetValue(key, out dummy))
{
   //Do something
}

Sometimes you use the out value retrieved, sometimes you do not, often you only use it within the scope of the condition. In any case, there is always the variable definition awkwardly hanging out on its own line, looking more important than it really is and leaving space for it to accidentally get used before it has been initialized. Thankfully, C#7 helps us tidy things up by allowing us to combine the variable definition with the argument.

Using the out variable syntax, we can write this:

if (int.TryParse(someString, out int dummy) && dummy > 0)
{
    //Do something
}

In fact, we do not even need to declare the type of the variable explicitly. While often we want to be explicit to make it clear that it matters (and to ensure we get some compile time checking of our assumptions), we can use an implicitly typed variable like this:

if (myDictionary.TryGetValue(someKey, out var dummy))
{
    //Do something
}

In Conclusion

out variables are part of a wider set of features for reducing repetition (in written code and in run-time execution), and saying more with less (i.e. making it easier for us to infer intent from the code without additional commentary). This is a very simply addition to C# syntax, yet useful. Not only does it reduce what we need to type, it also improves code clarity (in my opinion), and reduces the possibility of silly errors like using a variable before it has been initialized, or worse, thinking that it being uninitialized was a mistake and hiding a bug by initializing it.

Until next time, if you would like to tinker with any of the C#7 features I have been covering, I recommend getting the latest LINQPad beta or Visual Studio 2017 RC.

C#7: Binary Literals and Numeric Literal Digit Separators

Happy New Year, y'all! I thought I would kick off 2017 with a look at C#7. The next release of Visual Studio will soon be upon us and with it a new version of C#. As with its predecessor, C#6, C#7 brings a variety of syntactical and compiler magic allowing us to do more work with less code. Just as the new features of C#6 enabled us to make code more readable by reducing ceremony and making intent clearer¹, so go the new features of C#7.

Before we take a look closer look, here is an overview of the goodies in C#7:

It is a shorter list than the new features for C#6, but there is still a lot of goodness crammed in there. Over the next few posts, I want to delve into these features just a literal to familiarize myself (and you) with them and how they may impact the code we write. So, without further ado, let's take a look at the first two items on the list; binary literals and numeric literal digit separators.

Binary Literals

Numeric literals are not a new concept in C#. We have been able to define integer values in base-10 and base-16 since C# was first released. Common uses case for base-16 (also known as hexadecimal) literals are to define flags and bit masks in enumerations and constants. Since each digit in a base-16 number is 4 bits wide, each bit in that digit is represented by 1, 2, 4, and 8.

[Flags]
public enum Option
{
    None    = 0x00,
    Option1 = 0x01,
    Option2 = 0x02,
    Option3 = 0x04,
    Option4 = 0x08,
    Option5 = 0x10,
    Option6 = 0x20,
    Option7 = 0x40,
    Option8 = 0x80,
    All     = 0xFF
}

While this is familiar to most, using C#7 we can now express such things explicitly in base-2, more commonly referred to as binary. While hexadecimal literals are prefixed with 0x , binary literals are prefixed with 0b.

[Flags]
public enum Option
{
    None    = 0b00000000,
    Option1 = 0b00000001,
    Option2 = 0b00000010,
    Option3 = 0b00000100,
    Option4 = 0b00001000,
    Option5 = 0b00010000,
    Option6 = 0b00100000,
    Option7 = 0b01000000,
    Option8 = 0b10000000,
    All     = 0b11111111
}

Although I am used to using base-16 numbers for this, I can see value in being explicit by using binary literals. The strength comes when more than one bit is set. When using base-16, it can be easy to make a mistake and it is not immediately obvious what bits are set by a specific value². With binary literals, it is immediately obvious without additional, potentially erroneous side calculations.

Digit Separators

Of course, binary values can get big fast and keeping track of which things line up with which can be fraught with problems. Sure, we can try to line up the values, but what if the indentation gets one space off? Will we really notice during that code review?

To help with readability like this and to assist in avoiding silly off-by-one issues that can arise due to misaligned values, C#7 introduces _ as a digit separator for all numeric literals. This separator is stripped out by the compiler; it is just syntactical candy to aid readability and serves no purpose within the compiled code. For example, our enumeration above that uses binary literals can be rewritten as follows:

[Flags]
public enum Option
{
    None    = 0b0000_0000,
    Option1 = 0b0000_0001,
    Option2 = 0b0000_0010,
    Option3 = 0b0000_0100,
    Option4 = 0b0000_1000,
    Option5 = 0b0001_0000,
    Option6 = 0b0010_0000,
    Option7 = 0b0100_0000,
    Option8 = 0b1000_0000,
    All     = 0b1111_1111
}

I think this really does help with readability although I was disappointed to find that I could not use this separator directly after the base modifier. I do not know about anyone else, but it seems more readable to separate the modifier from the actual value. Thankfully, we can pad the left of our number with zeroes as long as the value we define fits into the type we are assigning.

byte a = 0b_0000_0001;  //INVALID: Digit separator cannot be at the start or end of the value
byte b = 0b1_0000_0001; //INVALID: 257 doesn't fit in a byte
byte c = 0b0_0000_0001; //VALID: 1 fits into a byte just fine

I suspect we may start seeing code that uses this "padding plus separator" approach once C#7 gets wider acceptance as I think it really improves readability; 0b0_0001_0000 is clearer to me than 0b0001_0000.

In addition, the digit separator is not limited to just binary numeric literals; it can be used in any numeric literal. For example, use it to separate 32-bit parts of a large hexadecimal number, or as a thousands separator in a floating point value; anywhere that it improves readability.

In Conclusion

The new binary literal syntax and digit separator should help to make intent clearer and code easier to read when used appropriately. As with any language feature, we must always use our best judgement to ensure it is being used appropriately. For more information on the features covered in this post, see the official documentation where you can also discover other C#7 magic that I will be covering in my upcoming posts.

Things like read-only auto-properties, expression-bodied member functions, exception filters, null-conditional operators, and the nameof operator to name a few [↩]
I know some can see in hex, and that's great, but not everyone is so adept [↩]

Running XUnit Tests, Using Traits, and Leveraging Parallelism

We have arrived at the end of this little series on migrating unit tests from MSTest to XUnit (specifically, XUnit 2). While earlier posts concentrated on writing the tests and the XUnit counterparts to MSTest concepts, this post will briefly look at some non-code aspects to XUnit; most importantly, we will look at getting the tests to run.

Do not depend on test order
One thing to watch out for after migrating your tests is the order in which tests are run. MSTest allowed us to abuse testing by assuming that tests would run in a specific order. XUnit does away with this and will run tests in a random order. This can help to find some obscure bugs, but it can also make migration a little tougher, especially when tests share a data store or some other test fixture. Watch out for that.

Visual Studio

Running tests inside Visual Studio is really simple. Following in the footsteps of web development trends, rather than requiring an extension to the development environment, XUnit uses package management¹. All you need to do is add the Visual Studio XUnit test runner package to your project and Visual Studio will be able to detect and run your XUnit tests just like your old MSTests.

Of course, if you're like me and absolutely loathe the built-in Visual Studio test explorer, you can use Resharper (or dotCover), which has built-in support for XUnit.

Command Line

More often than not, our continuous integration setups are scripted and we're unlikely to be running our unit tests via the development tool, such as Visual Studio. For situations like this, you can add the XUnit command line test runner package to your project. This provides a command line utility for running your tests with arguments to control exactly what tests and how².

Once the package has been added, you can browse to where Nuget is storing packages for your particular project. In the tools folder of the console runner package folder, you will find xunit.console.exe. If you run this with the -? argument, you will get some helpful information.

xunit.console.exe -?

The three options I use the most are -trait, -notrait, and -parallel.

Traits

The two trait options control what tests you are running based on metadata attached to those tests. This is really useful if you have some tests that are resource heavy and only used in certain circumstances, such as stress testing. In MSTest, you could attach arbitrary metadata onto test methods using the TestProperty attribute. XUnit provides a similar feature using Trait. For example, [Trait("Category", "ManualOnly")] could be used to put a method into the ManualOnly category. You could then use the following line to execute tests that lack this trait.

xunit.console.exe -notrait "Category=ManualOnly"

If you wanted to only run tests with a specific trait, you do the same thing but with -trait instead. Both of these options are very useful when combined with -parallel.

Parallelism

XUnit can run tests in parallel, but tests within the same collection are never run in parallel with each other. By default, each test class is its own collection. Test classes can be combined into collections using the Collection attribute. Tests within the same collection will be executed randomly, but never in parallel.

The -parallel option provides four options: no parallelism, running tests from different assemblies in parallel, running tests from different collections in parallel, or running tests from different assemblies and different collections in parallel.

The difference between assembly parallelism, collection parallelism, and both together

Let's assume you have two assemblies, A and B. Assembly A has three collections; 1, 2, and 3. Assembly B has three collections; 4, 5, and 6.

No Parallelism
-parallel none

XUnit will run each collection in one of the assemblies, one at a time, then run each collection in the other assembly one at a time.

Collections 1, 2, 3, 4, 5, and 6 will never execute at the same time as each other.

Parallel Assemblies
-parallel assemblies

XUnit will run each collection in assembly A, one at a time, at the same time as running each collection in assembly B, one at a time.

Collections 1, 2, and 3 will not execute at the same time as each other; Collections 4, 5, and 6 will not execute at the same time as each other; but collections 1, 2, and 3 will execute in parallel with 4, 5, and 6, as the 1, 2, and 3 are in a different assembly to 4, 5, and 6.

Parallel Collections
-parallel collections

XUnit will run each collection within an assembly in parallel, but only one assembly at a time.

Collections 1, 2 and 3 will execute parallel; collections 4, 5, and 6 will execute in parallel; but, 1, 2, and 3 will not run at the same time as 4, 5, and 6.

Parallel Everything
-parallel all

XUnit will run each collection in parallel, regardless of its assembly.

Collections 1, 2, 3, 4, 5, and 6 will run in parallel, each potentially running at the same time as any other.

Beware running tests in parallel when first migrating from MSTest. It is a surefire way of finding some heinous test fixture dependencies and you risk thinks like deadlocking on resources. Usually, running assemblies in parallel is a lot safer than running collections in parallel, assuming that tests are collocated in assemblies based on their purpose and the resources they interact with.

In Conclusion…

That brings us to the end of the series. I have focused primarily on migrating from MSTest, leaving out a lot of the nuances to XUnit. I highly recommend continuing your education with the XUnit documentation and through experimentation; having personally migrated several projects, I know you won't regret it.

I love this approach to augmenting the development environment. It requires no additional tooling setup to get your dev environment working. The source itself controls the tooling versions and installation. Wonderful [↩]
You can control how tests run under MSBuild too by using various properties. This is discussed more on the XUnit site [↩]