C#7: Binary Literals and Numeric Literal Digit Separators

Happy New Year, y'all! I thought I would kick off 2017 with a look at C#7. The next release of Visual Studio will soon be upon us and with it a new version of C#. As with its predecessor, C#6, C#7 brings a variety of syntactical and compiler magic allowing us to do more work with less code. Just as the new features of C#6 enabled us to make code more readable by reducing ceremony and making intent clearer[1], so go the new features of C#7.

Before we take a closer look, here is an overview of the goodies in C#7:

It is a shorter list than the new features for C#6, but there is still a lot of goodness crammed in there. Over the next few posts, I want to delve into these features just a little to familiarize myself (and you) with them and how they may impact the code we write. So, without further ado, let's take a look at the first two items on the list: binary literals and numeric literal digit separators.

Binary Literals

Numeric literals are not a new concept in C#. We have been able to define integer values in base-10 and base-16 since C# was first released. Common use cases for base-16 (also known as hexadecimal) literals are defining flags and bit masks in enumerations and constants. Since each digit in a base-16 number is 4 bits wide, the individual bits within a digit have the values 1, 2, 4, and 8.

[Flags]
public enum Option
{
    None    = 0x00,
    Option1 = 0x01,
    Option2 = 0x02,
    Option3 = 0x04,
    Option4 = 0x08,
    Option5 = 0x10,
    Option6 = 0x20,
    Option7 = 0x40,
    Option8 = 0x80,
    All     = 0xFF
}

While this is familiar to most, using C#7 we can now express such things explicitly in base-2, more commonly referred to as binary. Where hexadecimal literals are prefixed with 0x, binary literals are prefixed with 0b.

[Flags]
public enum Option
{
    None    = 0b00000000,
    Option1 = 0b00000001,
    Option2 = 0b00000010,
    Option3 = 0b00000100,
    Option4 = 0b00001000,
    Option5 = 0b00010000,
    Option6 = 0b00100000,
    Option7 = 0b01000000,
    Option8 = 0b10000000,
    All     = 0b11111111
}

Although I am used to using base-16 numbers for this, I can see value in being explicit by using binary literals. The strength comes when more than one bit is set. When using base-16, it can be easy to make a mistake and it is not immediately obvious which bits are set by a specific value[2]. With binary literals, the set bits can be read straight from the literal without additional, potentially erroneous side calculations.
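To make that concrete, here is a minimal sketch comparing the same multi-bit mask written both ways. The MaskDemo class, its constants, and the ToBits helper are names I made up for illustration; the mask corresponds to Option3 | Option4 | Option6 from the enumeration above.

```csharp
using System;

public static class MaskDemo
{
    // Hypothetical mask for illustration: bits 2, 3 and 5 set,
    // i.e. Option3 | Option4 | Option6 from the enum above.
    public const byte HexMask    = 0x2C;
    public const byte BinaryMask = 0b00101100;

    // Renders a byte as eight binary digits so the set bits are visible.
    public static string ToBits(byte value) =>
        Convert.ToString(value, 2).PadLeft(8, '0');

    public static void Main()
    {
        Console.WriteLine(HexMask == BinaryMask); // True
        Console.WriteLine(ToBits(HexMask));       // 00101100
    }
}
```

Working out that 0x2C means bits 2, 3, and 5 takes a moment of mental arithmetic; the binary form shows it directly.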

Digit Separators

Of course, binary values can get big fast and keeping track of which things line up with which can be fraught with problems. Sure, we can try to line up the values, but what if the indentation gets one space off? Will we really notice during that code review?

To help with readability like this and to assist in avoiding silly off-by-one issues that can arise due to misaligned values, C#7 introduces _ as a digit separator for all numeric literals. This separator is stripped out by the compiler; it is just syntactical candy to aid readability and serves no purpose within the compiled code. For example, our enumeration above that uses binary literals can be rewritten as follows:

[Flags]
public enum Option
{
    None    = 0b0000_0000,
    Option1 = 0b0000_0001,
    Option2 = 0b0000_0010,
    Option3 = 0b0000_0100,
    Option4 = 0b0000_1000,
    Option5 = 0b0001_0000,
    Option6 = 0b0010_0000,
    Option7 = 0b0100_0000,
    Option8 = 0b1000_0000,
    All     = 0b1111_1111
}
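As a quick sanity check that the separator really is ignored by the compiler, the following sketch compares the same value written with and without separators. The class and constant names are hypothetical, chosen only for this example.

```csharp
using System;

public static class SeparatorEquality
{
    // The same value, 16, written three ways.
    public const int Plain     = 0b00010000;
    public const int Grouped   = 0b0001_0000;
    // Grouping is arbitrary and repeated separators are allowed,
    // although this particular layout is not something I would recommend.
    public const int OddGroups = 0b0_00_1__0000;

    public static void Main()
    {
        Console.WriteLine(Plain);              // 16
        Console.WriteLine(Plain == Grouped);   // True
        Console.WriteLine(Plain == OddGroups); // True
    }
}
```

Since the compiler does not care where the separators go, choosing a consistent grouping (such as four bits at a time) is down to us.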

I think this really does help with readability although I was disappointed to find that I could not use this separator directly after the base modifier. I do not know about anyone else, but it seems more readable to separate the modifier from the actual value. Thankfully, we can pad the left of our number with zeroes as long as the value we define fits into the type we are assigning.

byte a = 0b_0000_0001;  //INVALID: Digit separator cannot be at the start or end of the value
byte b = 0b1_0000_0001; //INVALID: 257 doesn't fit in a byte
byte c = 0b0_0000_0001; //VALID: 1 fits into a byte just fine

I suspect we may start seeing code that uses this "padding plus separator" approach once C#7 gets wider acceptance as I think it really improves readability; 0b0_0001_0000 is clearer to me than 0b0001_0000.

In addition, the digit separator is not limited to just binary numeric literals; it can be used in any numeric literal. For example, use it to separate 32-bit parts of a large hexadecimal number, or as a thousands separator in a decimal or floating-point value; anywhere that it improves readability.
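Here is a small sketch of what that might look like in practice. The OtherLiterals class and all of the values in it are invented purely for illustration.

```csharp
using System;

public static class OtherLiterals
{
    // Thousands separators in a base-10 integer.
    public const int Million = 1_000_000;

    // A 64-bit hexadecimal value split into two 32-bit halves.
    public const ulong Wide = 0xDEADBEEF_CAFEBABE;

    // Thousands separators in the integral part of a decimal literal.
    public const decimal Price = 1_234_567.89m;

    public static void Main()
    {
        Console.WriteLine(Million == 1000000);         // True
        Console.WriteLine(Wide == 0xDEADBEEFCAFEBABE); // True
        Console.WriteLine(Price == 1234567.89m);       // True
    }
}
```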

In Conclusion

The new binary literal syntax and digit separator should help to make intent clearer and code easier to read. As with any language feature, we must use our best judgement to ensure they are used appropriately. For more information on the features covered in this post, see the official documentation, where you can also discover other C#7 magic that I will be covering in my upcoming posts.

  1. Things like read-only auto-properties, expression-bodied member functions, exception filters, null-conditional operators, and the nameof operator, to name a few.
  2. I know some can see in hex, and that's great, but not everyone is so adept.