C#7: Tools

I have spent the first couple of months of 2017 learning about the new features in C#7. This would not have been possible without some tools to help me play around with the new language syntax and associated types. Since we have to wait a little longer until Visual Studio 2017 is released, I thought you might like to know what tools I have been using to tinker in all things C#7.

LINQPad Beta

Link: http://www.linqpad.net/Download.aspx#beta

While early releases of Visual Studio 2017 (scheduled for release on March 7th) support the language, I initially found the release candidate to be unstable and frustrating. Not only that, but it can be cumbersome to spin up a quick example using Visual Studio, so I turned to my trusty friend, LINQPad.

LINQPad Beta showing me my return is missing a ref
LINQPad Beta showing me my return is missing a ref

I cannot recommend LINQPad enough, it is a fantastic tool for prototyping, poking around data sources, and more besides, like tinkering with language features you don't yet understand. While LINQPad's current release only supports the C# language up to version 61, the beta release also supports C# version 7. Not only can you use the language, but with the fantastic analysis window, you can see how Roslyn breaks down each part of the code. If you want to get started quickly, easily play around with the cool new features, and have a powerful tool for digging deeper as the need arises, the LINQPad beta is the tool to get.

Visual Studio 2017 RC

Link: https://www.visualstudio.com/downloads/

Visual Studio 2017 RC splash
Visual Studio 2017 RC splash

Yes, I know I said it was unstable and frustrating, but that was before, way back in January. These days the RC is much, much better and with the release date set for March 7th, there was never a better time to install Visual Studio 2017 RC and get a head start on getting to know some of the new things it can do, including C# 7. Tuples are fun, but poking around with them in the debugger is funner.


Link: https://oz-code.com/

Pattern matching in OzCode
Pattern matching in OzCode

It is no secret that I love OzCode, the magical debugging extension for Visual Studio. It is so well-known that they asked me to be part of their OzCode Magician community program. So, it should come as no surprise that I have been using OzCode in my exploration of C#7. As the Visual Studio 2017 RC has matured, the clever people over at CodeValue have been creating previews of OzCode version 3, including amazing LINQ debugging support. Recently, I got to try an internal build that included support for all the cool new things in C#7.

OzCode 3 will be released on March 7th, the same day as Visual Studio 2017.


Link: https://docs.microsoft.com/dotnet/articles/csharp/csharp-7

Never underestimate the power of reading documentation, it is one of the best tools out there. For my C#7 posts, I relied heavily on the new docs.microsoft.com site, specifically the .NET articles on C#. Not only is this a fantastic resource, but it has built-in support for commenting on the documentation so that you can ask questions and contribute to their improvement.

In Conclusion

This is the entire list of tools I used for my C#7 investigations. Try them out and get an early start on C#7 fun before the March 7th release of all the C#7 goodness. Happy tinkering and if you stumble on any useful tools, please share in the comments!

  1. It also supports SQL, F#, and VB 

Octokit, Merge Commits, and the Story So Far

In the last post we had reduced our commits by matching them against pull requests; next, we can look for noise in the commit message content itself. Although I have been using the Octokit.NET repository as the target for testing with its low noise, high quality commit messages, we can envisage a less consistent repository that has some noisy commits. For example, how often have you seen or written commit messages like "Fixed spelling", "Fixed bug", or "Stuff"1?

How we detect these noisy commits is important; if our filtering is too simple, we remove too many things and if it is too strict, we remove too few. Rather than go deep into one specific implementation, I just want to introduce the idea of filtering based on message content. In the long term, I think it would be interesting to apply learning algorithms,  but I'm sure some simple, configurable pattern matching should suffice2.

If I run the filtering I have described so far3 on the Octokit.NET latest release, this is what we get:

The value of this is clearer if we see the commit list before processing:

The work so far has reduced a list of 135 commits down to 58, and so far, it looks like we have not lost any really useful "release note"-worthy information. However, the eagle-eyed among you may noticed that our 58 messages contain duplicate information. This is because each pull request is listed twice; once for the pull request title I inserted in place of its individual commits, and again for the merge commit that merged that pull request. These merge commits are not filtered out because they do not belong to the commits inside the pull request. Instead, they are an artifact of merging the pull request4.

At first, I thought the handy MergeCommitSha property of the pull request would help, but it turns out this refers to a test merge and is to be deprecated5. Instead, I realised that the messages I wanted to remove all had "Merge pull request #" in them, followed by the pull request number. This seems like a perfect use case for our pattern matching filtering. Since we have the pull requests, we could use their numbers to match each merge message exactly, but I decided to do the simpler thing of excluding any message starting with "Merge pull request #".

Filtering for messages that begin with "Merge pull request #" gives us a shortlist of just 31 messages:

I think this is a pretty good improvement over the raw commit list. Combining this list with links back to the relevant commits and pull requests should enable someone to discern the content of a release note much faster than using the raw commit list alone. I will leave that as an exercise or perhaps a future post. As always, thanks for reading. If you find yourself using Octokit to trawl your own repositories for release note information, I would love to hear about it in the comments.

  1. We're all friends here, you can admit it 

  2. The filtering should be configurable so that we can tailor it to the repository we are processing 

  3. excluding the last step of filtering by message content 

  4. Perhaps stating the obvious 

  5. https://developer.github.com/v3/pulls/ 

Octokit and Noise Reduction with Pull Requests

Last time in this series on Octokit we looked at how to get the commits that have been made between one release and another. Usually, these commits will contain noise such as lazy commit messages and merge flog ("Fixed it", "Corrected spelling", etc.), merge commits, or commits that formed part of a larger feature change submitted via pull request. Rather than include all this noise in our release note generation, I want to filter those commits and either remove them entirely, or replace them with their associated pull request (which hopefully will be a little less noisy).

Before we filter out the noise, it seems prudent to reduce the commits to be filtered by matching them to pull requests. As with commits, we can query pull requests using a specific set of criteria; however, though we can request the results be sorted a certain way, we cannot specify a date range. To get all the pull requests that were merged before our release, we need to query for all the pull requests and then filter by date locally.

This query can be slow, since we are getting all closed pull requests in the repository. We could speed it up by providing a base branch name in the query criteria. However, to remove as much commit noise as possible, I would like to include pull requests that were merged to a different branch besides just the release branch1. We could make things more performant by managing a list of active release branches and then querying pull requests for each of those branches only rather than the entire repository, but for now, we will stick with the less optimal approach as it keeps the code examples a little cleaner.

Before we can start filtering our commits against the pull requests, we need to get the commits that comprise each pull request. When requesting a collection of items (like we did for pull requests), the GitHub API returns just enough information about each item so that we can filter and identify the ones we really care about. Before we can do things with other properties on the items, we have to request additional information. More information on each pull request can be obtained about a specific pull request by using the Get, Commits, Files, and Merged calls. The Get call returns the same type of objects as the GetAllForRepository method, except that all the data is now populated instead of just a few select properties; the Merged call returns a Boolean value indicating if the PR has been merged (equivalent to the Merged property populated by Get); the Files method returns the files changed by that pull request; and the Commits method returns the commits.

At this point, things are looking pretty good: we can get a list of commits in the release and a list of pull requests that might be in the release. Now, we want to filter that list of commits to remove items that are covered by a pull request. This is easy; we just compare the hashes and remove the matches.

Using the collection of commits for the latest release, we join the commits from the pull requests using the SHA hash and then select all release commits that have no matching commit in the pull requests2. However, we don't want to lose information just because we're losing noise, so we have to maintain a list of the pull requests that were matched so that we can build our release note history. To keep track, we will hold off on discarding any information by pairing up commits in the release with their prospective pull requests instead of just dropping them.

Going back to where we had a list of pull requests merged prior to our release, let us revisit getting the commits for those pull requests and this time, pairing them with the commits in the release to retain information.

Now we have a list of commits paired with their parent pull request, if there is one. Using this we can build a more meaningful set of changes for a release. If I run this on the latest release of the Octokit.NET repository and then group the commits by their paired pull request, I can see that the original list of 135 commits would be reduced to just 58 if each commit that belonged to a pull request were bundled into just one entry.

Next, we need to process the commits to remove those representing merges and other noise. These are things to discuss in the next post of this series where perhaps we will take stock and see whether this effort has been valuable in producing more meaningful release note generation. Until then, thanks for reading and don't forget to leave a comment.

  1. often changes are merged forward from one branch to another, especially if there are multiple release branches to support patch development and such 

  2. The join in this example is an outer join; we are taking the join results and using DefaultIfEmpty() to supply an empty collection when there was nothing to join 

Octokit and the Content of Releases

I started out my series on Octokit by defining a goal; to use GitHub repository history to build a basic summary of changes contained in a release. In order to do this, we need to define what a release is and then determine how we get the pertinent information to say what changes that release contains.

At a basic level, a release is a tagged point in the git repository. GitHub takes this one step further by making a release a first class concept as a lightweight git tag with additional attributes like a title and release notes. Octokit even allows first class access to GitHub releases in a repository, like so:

Great! With a little extra code, we can determine which release was the latest and then get all the commits in that release.

In the above code, we use MoreLinq to get the most recent release and then request all the commits in the repository on the same branch as that release up until the date the release was created. We request these commits using a CommitRequest object that specifies the query parameters. In this case, we want all the commits until the date of the release for the tag on which the release was made1. Of course, this will include everything ever done in that branch since the beginning of time, which is a bit of information overload. What we really want are the commits since the previous release.

Now we have taken the releases and used their CreatedAt dates to determine the most recent two and used the previous release date to set the Since date in our request. However, this code still has a flaw; we never said what branch the releases should be from. For all we know, the most recent two releases are on entirely different branches. To fix that, we need to filter the releases to just the branch we want.

The highlighted line is where we filter on the appropriate branch (it took some investigation to discover that the TargetCommitish property of a release is its branch name). We now have just the commits for the release branch we care about between the most recent release and the one before it.

In the next post, we will look at reducing the noise in the commit history using pull requests. Until then, thank you for stopping by and don't forget to leave a comment.


  1. The Sha property of the CommitRequest can be either a commit hash or branch/tag name 

Octokit and the Documentation Nightmare

Before I get into the meat of this series of posts, I would like to set the scene. Like many organisations that perform some level of software development these days, we use GitHub. Here at CareEvolution, some developers use the web interface extensively, some use the command line, and others use the GitHub desktop client1, but most use a combination of two or more, depending on the task. This works great for developers, who have each found a comfortable workflow for getting things done, but it is not so great for those involved with DevOps, QA, or documentation where there is a need to find out user-friendly details of what the developers did. Quite often, a feature or bug fix involves several commits and while each has a comment or two, and perhaps an associated pull request (PR) or issue has a general description, but there is no definitive list of "this is what release X contains" that can be presented to a customer. Not only that but sometimes a PR or issue is resolved in an earlier release and merged forward. While we have lists of what a release is going to include, quite often there is more detail that we would like to include, and we often have additional changes as we adapt to the changing requirements of our customers. All this means that one or more people end up trawling the commits, trying to determine what the changes are. It is not a happy task.

"There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things."

Niccolo Machiavelli
The Prince (1532)

Now, I know that this could all be avoided if people documented changes more clearly, perhaps added release notes to commits, raised issues for documentation changes, or created release notes on the release when it is made. However, no matter how noble change may be, anyone who has worked in process definition for any length of time will know that changing the behaviour of people is the hardest task of all, and therefore it should be avoided unless absolutely necessary. It was with that in mind that I decided mining the existing data for information would be an easier first step than jumping straight to asking people to change. So, with the aim of making life a little easier, I started looking at ways to automate the trawling.

I figured that by throwing out noisy and typical developer non-descriptive commits like "fixed spelling" or "updated comment", and by combining commits under the corresponding PR or issue, I could create useful summary of changes. This would not be customer-ready, but it would be ready for someone to turn into a release note without needing to trawl git history. In fact, if I included details of who committed the changes, it might even provide a feedback loop that would improve the quality of developer commit messages; developers do not like interruptions, so anyone asking for more detail on a commit they made should start to reinforce that if they wrote better commits, PRs, issues, they would get less interruptions.


Octokit .NET logoAfter a dismissing using git locally to perform this task (I figured those who might need this tool would probably not want to get the repository locally) and reading up on the GitHub API a little, I cracked open LINQPad —my tool of choice for hacking— and went looking for a Nuget package to help. It was during that search that I happily stumbled on Octokit, the official GitHub library for interacting with the GitHub API. At the time of writing, Octokit reflects the polyglot nature of GitHub users, providing variants for Ruby, .NET, and Objective C, as well as experimental versions for Python, and Go. I installed the Octokit Nuget package into LINQPad and started hacking (there is also a reactive version for IObservable fans).

Poking around the various objects, and reading some documentation on GitHub (Octokit is open source), I got a feel for how the library wrapped the APIs. Though, I had not yet got any code running, I was making progress. Confident that this would enable me to create the tool I wanted to create, I started writing some code to gather a list of releases for a specific repository and stumbled over my first hurdle; authentication. It turns out it is not quite as straight-forward as I thought (the days of username and password are quite rightly behind us3), and so, my adventure began.

And then…

This is a good place to stop for this week, I think. As the series progresses, I will be piecing together the various parts of my "release note guidance" tool and hopefully, end up with a .NET library to augment Octokit with some useful history mining functionality. Next time, we will take a look at authentication with Octokit (and there will be code).

  1. OSX and Windows variants 

  2. or, James Bond for kids 

  3. OK, that's a lie, but I want to encourage good behaviour