Octokit and Noise Reduction with Pull Requests

Last time in this series on Octokit, we looked at how to get the commits made between one release and another. Usually, these commits contain noise such as lazy commit messages ("Fixed it", "Corrected spelling", etc.), merge commits, or commits that formed part of a larger feature change submitted via pull request. Rather than include all this noise in our release note generation, I want to filter those commits and either remove them entirely or replace them with their associated pull request (which hopefully will be a little less noisy).

Before we filter out the noise, it seems prudent to reduce the commits to be filtered by matching them to pull requests. As with commits, we can query pull requests using a specific set of criteria; however, though we can request the results be sorted a certain way, we cannot specify a date range. To get all the pull requests that were merged before our release, we need to query for all the pull requests and then filter by date locally.

This query can be slow, since we are getting all closed pull requests in the repository. We could speed it up by providing a base branch name in the query criteria. However, to remove as much commit noise as possible, I would like to include pull requests that were merged to branches other than the release branch1. We could make things more performant by maintaining a list of active release branches and querying pull requests for only those branches rather than the entire repository, but for now, we will stick with the less optimal approach as it keeps the code examples a little cleaner.

var prRequest = new PullRequestRequest
{
    State = ItemState.Closed,
    SortDirection = SortDirection.Descending,
    SortProperty = PullRequestSort.Updated
};

var pullRequests = await gitHubClient.PullRequest.GetAllForRepository("RepositoryOwner", "RepositoryName", prRequest);
var pullRequestsPriorToRelease = pullRequests
    .Where(pr => pr.MergedAt < mostRecentTwoReleases[0].CreatedAt);

Before we can start filtering our commits against the pull requests, we need to get the commits that comprise each pull request. When requesting a collection of items (as we did for pull requests), the GitHub API returns just enough information about each item so that we can filter and identify the ones we really care about. Before we can do things with other properties on the items, we have to request additional information. For a specific pull request, that information is available via the `Get`, `Commits`, `Files`, and `Merged` calls. The `Get` call returns the same type of object as the `GetAllForRepository` method, except that all the data is now populated instead of just a few select properties; the `Merged` call returns a Boolean value indicating whether the PR has been merged (equivalent to the `Merged` property populated by `Get`); the `Files` method returns the files changed by that pull request; and the `Commits` method returns the commits.

var commitsForPullRequest = await gitHubClient.PullRequest.Commits("RepositoryOwner", "RepositoryName", pullRequest.Number);
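For completeness, here is a sketch of the `Get` and `Merged` calls described above, assuming the same `gitHubClient`, repository details, and `pullRequest` variable as the earlier examples:

```csharp
// Fetch the fully populated pull request object (all properties filled in).
var fullPullRequest = await gitHubClient.PullRequest.Get("RepositoryOwner", "RepositoryName", pullRequest.Number);

// Ask only whether the pull request has been merged.
var isMerged = await gitHubClient.PullRequest.Merged("RepositoryOwner", "RepositoryName", pullRequest.Number);
```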

At this point, things are looking pretty good: we can get a list of commits in the release and a list of pull requests that might be in the release. Now, we want to filter that list of commits to remove items that are covered by a pull request. This is easy; we just compare the hashes and remove the matches.

var commitsNotInPullRequest = from commit in commitsInRelease
                              join prCommit in prCommits on commit.Sha equals prCommit.Sha into matchedCommits
                              from match in matchedCommits.DefaultIfEmpty()
                              where match == null
                              select commit;

Using the collection of commits for the latest release, we join the commits from the pull requests using the SHA hash and then select all release commits that have no matching commit in the pull requests2. However, we don't want to lose information just because we're losing noise, so we have to maintain a list of the pull requests that were matched so that we can build our release note history. To keep track, we will hold off on discarding any information by pairing up commits in the release with their prospective pull requests instead of just dropping them.

Going back to where we had a list of pull requests merged prior to our release, let us revisit getting the commits for those pull requests and this time, pairing them with the commits in the release to retain information.

var commitsFromPullRequests = from pr in pullRequestsPriorToRelease
                              from commit in gitHubClient.PullRequest.Commits("RepositoryOwner", "RepositoryName", pr.Number).Result
                              select new { commit, pr };

var commitsWithPRs = from commit in commitsInRelease
                     join prCommit in commitsFromPullRequests on commit.Sha equals prCommit.commit.Sha into matchedPrCommits
                     from matchedPrCommit in matchedPrCommits.DefaultIfEmpty()
                     select new
                     {
                         PullRequest = matchedPrCommit?.pr,
                         Commit = commit
                     };

Now we have a list of commits paired with their parent pull request, if there is one. Using this we can build a more meaningful set of changes for a release. If I run this on the latest release of the Octokit.NET repository and then group the commits by their paired pull request, I can see that the original list of 135 commits would be reduced to just 58 if each commit that belonged to a pull request were bundled into just one entry.
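To make that grouping step concrete, here is a minimal, self-contained sketch using simple stand-in values in place of the Octokit models (the SHAs and pull request numbers are made up for illustration):

```csharp
using System;
using System.Linq;

// Stand-ins for the commit/pull request pairs built earlier;
// a null PR number represents a commit with no matching pull request.
var commitsWithPRs = new[]
{
    (Sha: "a1b2c3", PrNumber: (int?)42),
    (Sha: "d4e5f6", PrNumber: (int?)42),
    (Sha: "0a1b2c", PrNumber: (int?)null),
};

// Bundle commits that share a pull request into a single entry;
// unmatched commits remain as individual entries.
var releaseEntries = commitsWithPRs
    .GroupBy(x => x.PrNumber)
    .SelectMany(g => g.Key == null
        ? g.Select(x => $"Commit {x.Sha}")
        : new[] { $"Pull request #{g.Key} ({g.Count()} commits)" })
    .ToList();

foreach (var entry in releaseEntries)
{
    Console.WriteLine(entry);
}
```

With these stand-in values, the three commits collapse to two entries: one for pull request #42 and one for the unmatched commit.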

Next, we need to process the commits to remove those representing merges and other noise. These are things to discuss in the next post of this series where perhaps we will take stock and see whether this effort has been valuable in producing more meaningful release note generation. Until then, thanks for reading and don't forget to leave a comment.

  1. Often, changes are merged forward from one branch to another, especially if there are multiple release branches to support patch development and the like.
  2. The `join` in this example is an outer join; we are taking the join results and using `DefaultIfEmpty()` to supply an empty collection when there was nothing to join.

Use Inbox, Use Bundles, Change Your Life

I love Google Inbox. At work, it has enabled me to banish pseudo-spam1, prioritize work, and be more productive. At home, it has enabled me to quickly find details about upcoming trips including up-to-date flight information, remind myself of tasks that I need to do around the house, and generally be more organized. While Inbox doesn't replace Gmail for all situations, it does replace it as a day-to-day email client. Gmail is awesome; Inbox is awesomer.

The main feature of Inbox that has enabled me to build a healthier relationship with my email (and achieve virtual Inbox Zero without having to make email management my day job) is Bundling. Bundles in Inbox are a lot like filters that label emails in Gmail; they allow you to group emails based on rules. Not only can you define filters to determine what emails go in the bundle, but you can also decide if the bundled messages go to your inbox (and how often you see the bundle) or if they are just archived away.

Inbox Sidebar

Inbox comes with some standard bundles: Purchases, Finance, Social, Updates, Forums, Promos, Low Priority, and Trips. Trips is a magic bundle that you cannot configure; it gathers information about things like hotel stays, flight reservations, and car rentals, and combines them into trips like "Weekend in St. Louis" or "Trip to United Kingdom". The other bundles, equivalent to the tabs that Gmail added a year or two ago, automatically define the filters (and as such, those filters are uneditable), but do allow you to control the behavior of the bundle.

When bundled emails appear in your Inbox, they appear as a single item that can be expanded to view the emails inside. You can also mark the entire bundle as done2, if you desire. These features mean you can bundle emails in multiple bundles and have those bundles appear in your Inbox as messages arrive, once a day (at a time of your choice), once a week (at a time and day of your choice), or never3 (bundling some of the pseudo-spam and only having it appear once a day or once a week has drastically improved the signal-to-noise ratio of my email).

Trips
Trip to NYC

While the default bundles are useful, the real power is in defining your own. You can start fresh or use an existing label. Each label is shown in Inbox as unbundled and has a settings gear that allows you to set up bundling rules. In my example, I added a rule to label emails relating to the Ann Arbor .NET Developers group.

Label settings without any bundle rules
Adding a filter rule
Label settings showing bundling rules

With the bundle defined, every email that comes in matching the rule will be labelled and added to the bundle, which will appear in my inbox whenever a message arrives. Any messages I mark as done are archived4, removing them from the main inbox. However, they can be seen quickly by clicking the bundle name in the left-hand pane. This is great, except for one thing: the bundle definition only works for new emails as they arrive. It does not include messages you received before the bundle was set up.

This just felt untidy to me, so I was determined to fix it. As it turns out, Gmail provides all the tools needed to complete this part of Inbox bundles. Since each bundle is a set of filter rules and a label, you can edit those filters in Gmail, and Gmail includes the additional ability to apply a rule to existing emails. To do this, go to Gmail and click the Settings gear, then the Filters tab within settings.

Filters that define my bundle in Gmail

Find the filters that represent your new bundling rules, then edit each one in turn. On the first settings box for the filter, click Continue in the lower right. On the following screen, check the "Apply to matching conversations" checkbox and click the "Update filter" button.

First filter edit screen
Last filter editing screen
Apply to existing messages

After performing this action for each of my bundles, I returned to Inbox and checked the corresponding bundles; all my emails were now organised as I wanted.

In summary, if you haven't tried Inbox (and you have an Android or iPhone; a strange limitation that I wish Google would lift), I highly recommend spending some time with it, getting the bundles set up how you want, and using it as your primary interface for your Google email. The combination of bundling with the ability to treat emails as tasks (mark them as done, snooze them, pin them) and see them in a single timeline with your Google reminders makes Inbox a powerful yet simple way to manage day-to-day email. Before Inbox, I had long since abandoned Inbox Zero as a "fake work" task that held no value whatsoever; my Gmail inbox had hundreds of read and unread emails in it. Now that I have Inbox, I reached Inbox Zero in a day with minimal effort that one might consider to be just "reading email". I'm not saying Inbox Zero is valuable; I'm just saying that it is realistically achievable with Inbox because Inbox gets daily email management right.

Use bundles, change your life.

  1. I use the term "pseudo-spam" to describe those emails that you don't necessarily want to banish as spam (so that you can search them later), but that aren't important to you at all, such as support emails for projects you don't work on, or wiki update notifications.
  2. One of the great features of Inbox is the ability to treat emails as tasks, adding reminders to deal with them later or marking them as done; this makes drop, delegate, defer, do a lot easier to manage.
  3. If a bundle is marked as never, it is considered "unbundled" and works just as a filter that applies a label.
  4. This can be changed to "Move to Trash" in the Inbox settings.

My Maps from Google

Google provides some extremely useful online tools that many of us have come to rely on. From Sheets, Slides, and Docs, to Gmail, Maps, and Keep, the advertising giant tends to cover all the angles. The most recent Google tool that I have started using is an improvement over an old service called My Places that started out as a part of Google Maps. It is called My Maps and provides users with the means to build custom annotated maps. Any maps you create can then be embedded into websites or shared via email and social media1.

When first visiting the site, you are offered an option to either create a new map or open an existing map. Any places you had in My Places are already transferred to My Maps and available to open if you wish.

Initial view after creating a map

Creating a new map presents you with a familiar Google Maps-style view, but with additional overlays for editing the map. These are grouped at the top-left in two distinct clusters. The first is the map structure, where you can view and edit the name of the map and its layers, as well as add new layers and adjust the appearance of the base map layer. When I tried this, there were nine different base maps available; from left to right, top to bottom, these are Map, Satellite, Terrain, Light Political, Mono City, Simple Atlas, Light Landmass, Dark Landmass, and Whitewater.

Map structure with "Base map" menu opened

To edit the map name, map description, or layer name, just click the text.

Dialog for editing the map title and description

Below the description are options to add a new layer and to share the map. There is also a drop-down that provides options to open or create a map, and to delete, export, embed, and print the current map. Below that, each layer is shown. The layer drop-down can be used to rename or delete the layer, as well as view a data table of items on that layer. The data table allows you to add additional information about the various things that have been added to the map (two columns for name and description are provided by default, but more can be added).

The My Maps toolbox

Next to the map structure is the toolbox. The toolbox contains a search bar, allowing you to find the area of the map on which to base your customizations. Below the search bar are buttons to undo, redo, select and manipulate, add places, draw lines, add directions, and measure distances and areas. Using these tools, you can build up map layers. When building the Stonehenge map for my blog post on our trip there, I was able not only to search for and mark existing places, but also to add custom places. Each item added goes into the selected map layer, which is indicated by a colored bar on the left edge of the layer in the map structure. Clicking a layer changes the layer being edited.

Adding and editing a layer, showing the layer selection bar on the left edge

The appearance of each item in a layer can be modified, either as a group via the styles option at the top of the layer, or individually via the paint can icon on each item. By editing the layer style, you can also choose which column in the data table for that layer provides the text for items in that layer, and style items based on data in the data table (useful for representing data on the map). There is a lot of scope in this area, so I recommend playing around with it and seeing what works for your specific use case.

Editing the appearance of items on a layer is easy

Once you are happy with the map you have created, you can share it, export it to KML (for use in Google Earth and other apps that support KML), and embed it into websites. The main share options will be familiar to anyone who has shared a document from Sheets, Slides, or Docs, allowing you to share a link to the map as well as control who can edit and view it. If you want to embed the map in a website, an embed code is provided via the map menu; however, as the site will tell you, you need to make the map public before you can embed it.

Assigning permissions is consistent with other Google apps

All in all, I found My Maps a pleasant discovery and really nice to use. The styling options and ability to add additional data allow for some impressive customization. I am certainly going to use this application more in the future. How about you? Leave a comment and let me know your experiences with this new addition to Google's collection of online applications, or perhaps add details of alternatives that are out there.

  1. Those planning to use Google My Maps for commercial use should review Google's permissions and license terms before proceeding.

Some of my favourite tools

Update: This post has been updated to recognise that CodeLineage is now maintained by Hippo Camp Software and not Red Gate Software as was originally stated.

If you know me, you might well suspect this post is about some of the idiots I know, but it is not; this post is entirely about some of the tools I use in day-to-day development. This is by no means an exhaustive list, nor is it presented in any particular order. However, assuming you are even a little bit like me as a developer, you will see a whole bunch of things you already use, but hopefully there is at least one item that is new to you. If you find something new and useful here, or you have some suggestions of your own, please feel free to post a comment.

OzCode

OzCode is an add-in for Visual Studio that provides some debugging superpowers, like collection searching, adding computed properties to objects, pinning properties so that you don't have to go hunting in the object tree, simpler tracepoint creation, and a bunch more. I first tried this during its beta and was quickly sold on its value. Give the 30-day trial a chance and see if it works for you.

ReSharper

This seems to be a staple for most C# developers. I was a latecomer to using this tool, and I am not sure I like it for the same reasons as everyone else. I actually love ReSharper for its test runner, which is a more performant alternative to Visual Studio's built-in Test Explorer, and for the ability to quickly change file names to match the type they contain. It has a lot of features, though, so while it is not free, give the trial a chance and see if it fits.

Web Essentials

Another staple for many Visual Studio developers, Web Essentials provides lots of support for web-related development, including enhanced support for JavaScript, CSS, CoffeeScript, LESS, SASS, Markdown, and much more. If you do any kind of web development, this is essential1.

LINQPad

I was late to the LINQPad party, but I gave it a shot during Ann Arbor Give Camp 2013 and, within my first hour or two of using it, dropped some cash on the premium version (it is very inexpensive for what you get). Since then, whether it is hacking code or hacking databases, I have been using LINQPad as my standard tool for hacking.

For code, it avoids the overhead of creating the projects and command-line, WinForms, or WPF wrapper apps that you would need in Visual Studio. For databases, LINQPad gives you the freedom to use SQL, C#, F#, or VB for querying and manipulating your database, as well as support for many different data sources beyond just SQL Server, providing an excellent alternative to SQL Management Studio.

LINQPad is free, but you get some cool features if you go premium, and considering the sub-$100 price, it is totally worth it.

JustDecompile

When Red Gate stopped providing Reflector for free, JetBrains and Telerik stepped up with their own free decompilers for poking around inside .NET code. These are often invaluable when tracking down obscure bugs or wanting to learn more about the code that is running when you did not write it. While JetBrains' dotPeek is useful, I have found that JustDecompile from Telerik has a better feature set (including showing MSIL, which I could not find in dotPeek).

Chutzpah

Chutzpah is a test runner for JavaScript unit tests and is available as a NuGet package. It supports tests written for Jasmine, Mocha, and QUnit, as well as a variety of languages, including CoffeeScript and TypeScript. There are also two Visual Studio extensions that provide Test Explorer integration and a handy context menu. Of these, I find the context menu most useful.

Chutzpah is a great option when you cannot leverage a NodeJS-based toolchain like Grunt or Gulp, or some other non-Visual Studio build process.

CodeLineage

CodeLineage is a free Visual Studio extension from Hippo Camp Software2. Regardless of your source control provider, CodeLineage provides you with a simple interface for comparing different points in the history of a given file. The simple interface makes it easy to select which versions to compare. I do not use this tool often, but when I need it, it is fantastic.

FileNesting

This Visual Studio extension from the developer of Web Essentials makes nesting files under one another a breeze. You can set up automated nesting rules or perform nesting manually.

I like to keep types separated by file when developing in C#. Files are cheap, and it helps discovery when navigating code. However, this sometimes means using partial classes to keep nested types separate, so to keep my Solution Explorer tidy, I edit the project files and nest source code files. I also find this useful for Angular directives, allowing me to apply the familiar pattern of organizing code-behind under presentation by nesting JavaScript files under the template HTML.

Whether you have your own nesting guidelines or want to ensure generated code is nested under its corresponding definition (such as JavaScript generated from CoffeeScript), this extension is brilliant.

Switch Startup Project

Ever hit F5 to debug, only to find out you tried to start a non-executable project and have to hunt for the right project in the Solution Explorer? This used to happen to me a lot, but not since I installed this handy extension, which adds a drop-down to the toolbar where I can select the project I want to be my startup project. A valuable time saver.

MultiEditing

Multi-line editing has been a valuable improvement in recent releases of Visual Studio, but it has a limitation in that you can only edit contiguous lines at the same column location. Sometimes, you want to edit multiple lines in a variety of locations and with this handy extension, you can. Just hold ALT and click the locations you want to multi-edit, then type away.

Productivity Power Tools

Productivity Power Tools for Visual Studio have been a staple extension since at least Visual Studio 2008. Often the test bed of features that eventually appear as first class citizens in the Visual Studio suite, Productivity Power Tools enhances the overall Visual Studio experience.

The current version for Visual Studio 2013 provides support for colour printing, custom document tabs, copying as HTML, error visualization in the Solution Explorer, time stamps in the debug output margin, double-click to maximize and dock windows, and much more. This is a must-have for any Visual Studio user.

  1. Yes, I went there.
  2. Though it was maintained by Red Gate when I first started using it.

Caching with LINQPad.Extensions.Cache

One of the tools that I absolutely adore during my day-to-day development is LINQPad. If you are not familiar with this tool and you are a .NET developer, you should go to www.linqpad.net right now and install it. The basic version is free and feature-packed, though I recommend upgrading to the professional version. Not only is it inexpensive, but it also adds some great features like Intellisense1 and NuGet package support.

I generally use LINQPad as a simple coding environment for poking around my data sources, crafting quick coding experiments, and debugging. Because LINQPad does not have the overhead of a solution or project, like a development-oriented tool such as Visual Studio, it is easy to get stuck into a task. I no longer write throwaway console or WinForms apps; instead, I just throw together a quick LINQPad query. I could continue extolling the virtues of this tool2, but I would like to touch on one of its utility features.

As part of LINQPad, you get some useful methods and types for extending LINQPad, dumping information to LINQPad's output window, and more. Two of these methods are LINQPad.Extensions.Cache and Util.Cache. Using either Cache method, you can execute some code and cache the result locally, then use the cached value for all subsequent runs of that query. This is incredibly useful for caching the results of an expensive database query or a computationally-intensive calculation. To cache an IEnumerable<T> or IObservable<T>, you can do something like this:

var thingThatTakesALongTime = from x in myDB.Thingymabobs
                              where x.Whatsit == "thingy"
                              select x;
var myThing = LINQPad.Extensions.Cache(thingThatTakesALongTime);

Or, since it's an extension method,

var myThing = thingThatTakesALongTime.Cache();

For other types, Util.Cache will cache the result of an expression.

var x = Util.Cache(() => { /* Something expensive */ });

The first time I run my LINQPad code, my lazily evaluated query or the expression is executed by the Cache method and the result is cached. From then on, each subsequent run of the code uses the cached value. Both Cache methods also take an optional name for the cached item, in case you want to differentiate items that might otherwise be indistinguishable (such as caching a loop computation).
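For example, a name keeps each iteration of a loop in its own cache entry (a sketch; `customerIds`, `LoadOrdersFor`, and the key format are hypothetical):

```csharp
// Each customer's result is cached under its own key, so subsequent
// runs of the query skip the expensive load for every customer.
foreach (var customerId in customerIds)
{
    var orders = Util.Cache(() => LoadOrdersFor(customerId), "orders-" + customerId);
    // ... use orders ...
}
```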

This is, as I alluded to earlier, one of many utilities provided within LINQPad that make it a joy to use. What tools do you find invaluable? Do you already use LINQPad? What makes it a must-have tool for you? I would love to hear your responses in the comments.

Updated to correct the casing of LINQPad, draw attention to Cache being an extension method in some uses, and add a note about Util.Cache3.

  1. Including for imported data types from your data sources.
  2. Such as its support for F#, C#, SQL, etc., or its built-in IL disassembly.
  3. Because, apparently, I am not observant of this stuff the first time around. SMH.