In the last post we reduced our commits by matching them against pull requests; next, we can look for noise in the commit message content itself. Although I have been testing against the Octokit.NET repository, with its low-noise, high-quality commit messages, we can envisage a less consistent repository that has some noisy commits. For example, how often have you seen or written commit messages like "Fixed spelling", "Fixed bug", or "Stuff"[1]?
How we detect these noisy commits is important; if our patterns are too broad, we remove too many things, and if they are too narrow, we remove too few. Rather than go deep into one specific implementation, I just want to introduce the idea of filtering based on message content. In the long term, I think it would be interesting to apply learning algorithms, but I'm sure some simple, configurable pattern matching should suffice[2].
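As a rough illustration of what configurable pattern matching might look like, here is a minimal sketch in Python rather than the C# used with Octokit.NET. The pattern list and the function names (`NOISE_PATTERNS`, `is_noise`, `filter_noise`) are hypothetical and not part of the series' code; the patterns would need tuning for a real repository.

```python
import re

# Hypothetical noise patterns, matched case-insensitively against the
# subject (first) line of each commit message. Tune for your own repository.
NOISE_PATTERNS = [
    r"^fix(ed|es)?\s+(spelling|typos?|bug)\.?$",
    r"^stuff\.?$",
    r"^wip\b",
]


def compile_patterns(patterns):
    """Compile each pattern once so it can be reused across messages."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def is_noise(message, compiled):
    """Return True if the subject line matches any noise pattern."""
    lines = message.strip().splitlines()
    subject = lines[0].strip() if lines else ""
    return any(p.search(subject) for p in compiled)


def filter_noise(messages, patterns=NOISE_PATTERNS):
    """Return the messages whose subject line matches none of the patterns."""
    compiled = compile_patterns(patterns)
    return [m for m in messages if not is_noise(m, compiled)]
```

Because the patterns are anchored to the whole subject line, a descriptive message such as "Fixes for json serialization bug" survives while a bare "Fixed bug" does not. Loosening or tightening the anchors is one way to adjust how aggressive the filter is.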
If I run the filtering I have described so far[3] on the latest Octokit.NET release, this is what we get:
- Fix the credit format
- Release notes for release 0.17.0
- Merge pull request #972 from naveensrinivasan/json-serialization Json serialization for Unicode
- Merge pull request #976 from octokit/elbaloo-better-merge-exception-rebased better merge exception rebased
- Merge pull request #973 from naveensrinivasan/appveyornuget Generate nuget packages on appveyor
- Merge pull request #917 from alfhenrik/feature-webhookhelper Add helper class for creating web hooks
- Merge pull request #807 from octokit/codeformatter added a tailored CodeFormatter to Octokit
- Merge pull request #956 from octokit/vs2015-support VS2015 migration
- Merge pull request #921 from naveensrinivasan/samples Adds octokit samples
- Merge branch 'gitignore-exception'
- Merge pull request #918 from willsb/download-timeout Adds overloads to GetArchive for adding custom timeouts
- Merge pull request #957 from octokit/clean-up-some-fixes clean up some pending PRs
- Merge pull request #943 from naveensrinivasan/AssetDownload Fixes for Downloading ReleaseAsset zip File
- Merge pull request #942 from alfhenrik/bug-repohasissues Make NewRepository.HasIssues nullable as it's optional
- Merge pull request #940 from naveensrinivasan/build-sh Created build.sh
- Merge pull request #929 from elbaloo/issue-389 Add .com links to PrivateRepositoryQuotaExceededException
- Merge pull request #927 from naveensrinivasan/octokit-logo Updated with the logo
- Merge pull request #922 from naveensrinivasan/fixes-for-fake-warning Fixes for FAKE Xunit warning
- Merge pull request #919 from adamralph/system-framework-assembly add System to required framework assemblies for net45
- Merge pull request #909 from willsb/disposable-repositories Disposable repositories
- Merge pull request #916 from octokit/consolidate-committer-info Consolidate committer info
- Merge pull request #915 from octokit/docs Add a bunch of XML doc comments
- Merge pull request #907 from naveensrinivasan/encodedcontent-public-#861 Making Encodedcontent public #861
- Merge pull request #908 from khellang/clarify-failing-convention-tests Clarify why convention tests are failing
- Merge pull request #906 from naveensrinivasan/update-readme Updated the readme with reactive octokit.
- Merge pull request #903 from willsb/commit-committer Changes GitHubCommit.Author/Committer
- Merge pull request #902 from naveensrinivasan/build-mono Build fix for Xamarin Studio Solution
- Merge pull request #901 from alfhenrik/feature-issueeventsurl#885 Add Events URL to the Issue class.
- Merge pull request #900 from alfhenrik/update-testtargetnames-in-docs Updated test target names in the shipping releases doc
- Merge pull request #898 from octokit/release Release of v0.16 - ironic ties
- better merge exception rebased
- Generate nuget packages on appveyor
- Json serialization for Unicode
- Add helper class for creating web hooks
- added a tailored CodeFormatter to Octokit
- VS2015 migration
- clean up some pending PRs
- Fixes for Downloading ReleaseAsset zip File
- Adds overloads to GetArchive for adding custom timeouts
- Make NewRepository.HasIssues nullable as it's optional
- Created build.sh
- Gitignore exception
- Add .com links to PrivateRepositoryQuotaExceededException
- Updated with the logo
- Adds octokit samples
- Fixes for FAKE Xunit warning
- add System to required framework assemblies for net45
- Disposable repositories
- Consolidate committer info
- Add a bunch of XML doc comments
- Making Encodedcontent public #861
- Clarify why convention tests are failing
- Updated the readme with reactive octokit.
- Changes GitHubCommit.Author/Committer
- Build fix for Xamarin Studio Solution
- Add Events URL to the Issue class.
- Updated test target names in the shipping releases doc
- Release of v0.16 - ironic ties
The value of this is clearer if we see the commit list before processing:
Fix the credit format Release notes for release 0.17.0 Merge pull request #972 from naveensrinivasan/json-serialization Json serialization for Unicode Merge pull request #976 from octokit/elbaloo-better-merge-exception-rebased better merge exception rebased Merge branch 'better-merge-exception-rebased' of https://github.com/elbaloo/octokit.net into elbaloo-better-merge-exception-rebased Merge pull request #973 from naveensrinivasan/appveyornuget Generate nuget packages on appveyor The test targets were deleting the nuget packages The test targets were deleting the nuget packages so had to include the CreatePackages at the end. Removed the disable on PR. Create packages in turn calls build app Create packages in turn calls build app so no need to call it. appveyor nuget packages appveyor nuget packages Checked for the serialized data Compared if the serialized data has what was expected. Not just deserialized data. Tests for Unicode character serialization Tests for Unicode character serialization Fixes for json serialization bug Fixes for json serialization issue when unicode is present. Merge pull request #917 from alfhenrik/feature-webhookhelper Add helper class for creating web hooks A bit of code cleanup Add unit test to ensure correct message is returned when duplicate keys exists. Throw exception with helpful message if duplicate webhook config values exists. Fix up XML comments as per PR review Conform NewRepositoryWebHook to new request model guidelines Update existing integration test to use new web hook helper class Add unit tests Add helper class to create a web hook. 
Fixes octokit/octokit.net#914 Merge pull request #807 from octokit/codeformatter added a tailored CodeFormatter to Octokit aaaand format the code skip unicode character editing format the code in the script local install of code formatter Merge pull request #956 from octokit/vs2015-support VS2015 migration Merge pull request #921 from naveensrinivasan/samples Adds octokit samples Merge branch 'gitignore-exception' Merge pull request #918 from willsb/download-timeout Adds overloads to GetArchive for adding custom timeouts a bit more cleanup of the README one more malformed xml-docs tag Merge branch 'master' into vs2015-support Merge pull request #957 from octokit/clean-up-some-fixes clean up some pending PRs address feedback added tests for the merged qualifier added "Merged" in searchissues which allows search repos by merged date with existing syntax. it generates a CA1502 code excessive complexity warning and i suppressed it. Fixed the problem in the constructor. run build fixproject Added NewArbitraryMarkdown class, RenderArbitraryMakrdown method and unit tests for it. added assignee property to pull request. tidy up some xml-docs while i'm in here actually some real errors just suppressing some warnings, nbd update README to indicate we're using VS2015 update the target to use netcore451 bump the ToolsVersion bump to netcore451 tweak ignore file update to the latest MSBuild scripts Merge branch 'master' into better-merge-exception-rebased Merge pull request #943 from naveensrinivasan/AssetDownload Fixes for Downloading ReleaseAsset zip File Fixed the spacing Fixed the spacing of comma and aligned the arguments. Fixes for Downloading ReleaseAsset zip File #854 This commit addressed the `BuildResponse` wasn't handling response `content-type` `application/octet-stream` for binary items. 
Merge branch 'master' into download-timeout Make new merge exceptions inherit from 'Octokit.ApiException' Affect 'Octokit.PullRequestNotMergeableException' and 'Octokit.PullRequestMismatchException' Merge pull request #942 from alfhenrik/bug-repohasissues Make NewRepository.HasIssues nullable as it's optional Make HasIssues nullable as it's optional Merge pull request #940 from naveensrinivasan/build-sh Created build.sh Created build.sh Included build.sh to build form non-windows :poop: brainfart Add tests for merge exceptions to PullRequestsClientTests Add System.Net namespace used to check for HttpStatusCode in PullRequestClient.Task<PullRequestMerge> Merge(string, string, int, MergePullRequest) sketching out the exception necessary when raising specific merge exceptions Changes the way the exception is verified Merge pull request #929 from elbaloo/issue-389 Add .com links to PrivateRepositoryQuotaExceededException Add .com links to PrivateRepositoryQuotaExceededException Add following links: - 'Deleting a repository' at https://help.github.com/articles/deleting-a-repository/ - 'What plan should I use?' at https://help.github.com/articles/what-plan-should-i-choose/ Merge pull request #927 from naveensrinivasan/octokit-logo Updated with the logo Changed the octokit logo to smaller size Updated with the logo Updated it with the logo Validate Linqpad Samples as part of CI Validates Linqpad Samples as part of CI for every commit. Removed the integration test options Removed the integration test options because lprun has compileonly option. The nuget package includes the samples This will include the samples in the nuget package. 
Throwing proper exception on RepositoresClient Merge pull request #922 from naveensrinivasan/fixes-for-fake-warning Fixes for FAKE Xunit warning Merge remote-tracking branch 'origin/fixes-for-fake-warning' into fixes-for-fake-warning Conflicts: build.fsx Fixes for fake warning Fixes for the FAKE warning Adds InvalidGitIgnoreTemplateException Fixes for fake warning Fixes for the FAKE warning Including LINQPad.exe Including LINQPad.exe to compile the samples after every commit Fixed the command line args Fixed the args parameter to compile using lprun.exe linqpad samples Linqpad samples Removes integer overload Plus extra ensures Merge pull request #919 from adamralph/system-framework-assembly add System to required framework assemblies for net45 add System to required framework assemblies for net45 Adds overloads for adding custom timeouts Merge pull request #909 from willsb/disposable-repositories Disposable repositories Merge pull request #916 from octokit/consolidate-committer-info Consolidate committer info Merge remote-tracking branch 'octokit/master' into disposable-repositories Conflicts: Octokit.Tests.Integration/Clients/DeploymentStatusClientTests.cs Octokit.Tests.Integration/Clients/DeploymentsClientTests.cs Octokit.Tests.Integration/Clients/PullRequestsClientTests.cs Refactors the remaining test classes Add doc comments for Author and Committer Move Committer into Common folder This object is used both in requests and responses. Add a README for model objects Replace SignatureResponse and CommitEntity with Committer A recent PR added CommitEntity but we already had SignatureResponse expressly for this purpose. So this commit renames SignatureResponse to Committer and removes CommitEntity and replaces it with Committer. Merge pull request #915 from octokit/docs Add a bunch of XML doc comments Add this PR number for these fixes So meta! 
Add Description to OrganizationUpdate Add Before property to NotificationsRequest Added Description property to NewTeam Teams can have descriptions! Added Content property to NewTreeItem Add a bunch of doc comments We get a lot of build output because of missing XML comments that we ignore. I'd like to stop ignoring them. To do that, we need to doc the :poop: out of everything. Deployment state is required for deployment status Breaking change. This constructor parameter is now required. Add missing properties to NewDeployment Added `RequiredContexts`, `Environment`, and `Task` parameters. Removed the obsolete `Force` parameter. Also made ref a required constructor parameter. This is a breaking change. Add the ability to create a readonly deploy key Rename Message to CommitMessage According to the docs (https://developer.github.com/v3/pulls/#merge-a-pull-request-merge-button), this should be sent as "commit_message" thus we need to name it `CommitMessage` Fixes #913 Refactors tests up to PullRequestsClientTests Adds common properties to RepositoryContext A lot of classes use the name and the owner of the repository, so for consistency I added those as properties of the Context Refactors a whole bunch of tests Refactors AssigneesClient and CommitsClient tests Refactors BranchesClientTests Refactors StatisticsClient Refactors GithubClient and RepositoryContents Merge pull request #907 from naveensrinivasan/encodedcontent-public-#861 Making Encodedcontent public #861 Refactors RepositoriesClientTests Changes the tests in RepositoriesClientTests to use the new using block syntax RepositoryContext class and Extension methods fix for making the setter private fix for making the setter private Merge pull request #908 from khellang/clarify-failing-convention-tests Clarify why convention tests are failing Clarify why convention tests are failing Making EncodedContent public Making EncodedContent public to get the raw bytes of a file. 
#861 Merge branch 'octokit/master' of https://github.com/naveensrinivasan/octokit.net into octokit/master Merge pull request #906 from naveensrinivasan/update-readme Updated the readme with reactive octokit. Update read with reactive octokit. Updated the readme to include the nuget reference to Octokit.Reactive Merge pull request #903 from willsb/commit-committer Changes GitHubCommit.Author/Committer Merge pull request #902 from naveensrinivasan/build-mono Build fix for Xamarin Studio Solution Merge pull request #901 from alfhenrik/feature-issueeventsurl#885 Add Events URL to the Issue class. Makes integrations tests happy Build fix for Xamarin Studio Solution Build fix for Xamarin Studio Solution Creates CommitEntity for GitHubCommit Creates the entity that corresponds to the actual payload returned by the server to represent the Author and Committer of a commit Merge pull request #900 from alfhenrik/update-testtargetnames-in-docs Updated test target names in the shipping releases doc Add Events URL to the Issue class. Update the names of the test targets Merge branch 'master' into octokit/master Merge pull request #898 from octokit/release Release of v0.16 - ironic ties Update FAKE and SourceLink
The work so far has reduced a list of 135 commits down to 58, and it looks like we have not lost any really useful "release note"-worthy information. However, the eagle-eyed among you may have noticed that our 58 messages contain duplicate information. This is because each pull request is listed twice: once for the pull request title I inserted in place of its individual commits, and again for the merge commit that merged that pull request. These merge commits are not filtered out because they do not belong to the commits inside the pull request. Instead, they are an artifact of merging the pull request[4].
At first, I thought the handy `MergeCommitSha` property of the pull request would help, but it turns out this refers to a test merge and is to be deprecated[5]. Instead, I realised that the messages I wanted to remove all had "Merge pull request #" in them, followed by the pull request number. This seems like a perfect use case for our pattern matching filtering. Since we have the pull requests, we could use their numbers to match each merge message exactly, but I decided to do the simpler thing of excluding any message starting with "Merge pull request #".
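Both options can be sketched in a few lines. The following is an illustrative Python sketch, not the Octokit.NET code behind this series; `drop_merge_commits` and `drop_known_merge_commits` are hypothetical names, and the messages are assumed to be plain strings.

```python
import re

MERGE_PREFIX = "Merge pull request #"

# Captures the pull request number that follows the merge prefix.
MERGE_PATTERN = re.compile(r"^Merge pull request #(\d+)\b")


def drop_merge_commits(messages, prefix=MERGE_PREFIX):
    """Simple option: exclude any message that starts with the prefix."""
    return [m for m in messages if not m.startswith(prefix)]


def drop_known_merge_commits(messages, pull_request_numbers):
    """Stricter option: exclude a merge message only when its number
    belongs to a pull request we have already matched."""
    known = set(pull_request_numbers)
    kept = []
    for message in messages:
        match = MERGE_PATTERN.match(message)
        if match and int(match.group(1)) in known:
            continue  # merge commit for a pull request we already list by title
        kept.append(message)
    return kept
```

The prefix check is simpler and needs no pull request data, while the stricter variant keeps any merge message whose pull request was not matched, which may be preferable if those merges would otherwise vanish from the notes entirely. Neither touches messages such as "Merge branch 'gitignore-exception'".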
Filtering out messages that begin with "Merge pull request #" gives us a shortlist of just 31 messages:
- Fix the credit format
- Release notes for release 0.17.0
- Merge branch 'gitignore-exception'
- better merge exception rebased
- Generate nuget packages on appveyor
- Json serialization for Unicode
- Add helper class for creating web hooks
- added a tailored CodeFormatter to Octokit
- VS2015 migration
- clean up some pending PRs
- Fixes for Downloading ReleaseAsset zip File
- Adds overloads to GetArchive for adding custom timeouts
- Make NewRepository.HasIssues nullable as it's optional
- Created build.sh
- Gitignore exception
- Add .com links to PrivateRepositoryQuotaExceededException
- Updated with the logo
- Adds octokit samples
- Fixes for FAKE Xunit warning
- add System to required framework assemblies for net45
- Disposable repositories
- Consolidate committer info
- Add a bunch of XML doc comments
- Making Encodedcontent public #861
- Clarify why convention tests are failing
- Updated the readme with reactive octokit.
- Changes GitHubCommit.Author/Committer
- Build fix for Xamarin Studio Solution
- Add Events URL to the Issue class.
- Updated test target names in the shipping releases doc
- Release of v0.16 - ironic ties
I think this is a pretty good improvement over the raw commit list. Combining this list with links back to the relevant commits and pull requests should enable someone to discern the content of a release note much faster than using the raw commit list alone. I will leave that as an exercise or perhaps a future post. As always, thanks for reading. If you find yourself using Octokit to trawl your own repositories for release note information, I would love to hear about it in the comments.