Breaking The Code

Last week saw the return of Learn Something, a monthly software hacking event held at the offices of Fanzoo Technology in Ann Arbor. Each month, attendees can choose to hack on their own projects or take part in the monthly Learn Something challenge. Teams and pairing are encouraged, providing opportunities to work with new people, tools and techniques.

The message we had to decode

This month the challenge was to decipher a message encoded using a simple substitution cipher. Participants started with a copy of the encoded message (known as the ciphertext) and a text file of words from the English language. The message (shown above) is one example of a cipher, known as 'alienese', that appeared in several episodes of Futurama. Though fans of the show have already solved the cipher and we could have searched the Internet for the solution, our goal at Learn Something was to create a software solution that could solve it (or at least narrow down the possibilities[1]).

What is a simple substitution cipher? A simple substitution cipher is a way of encoding a message by replacing (substituting) each letter of the alphabet with a different letter or symbol. For example, if we had the substitutions `t` to `a`, `h` to `g`, and `e` to `o`, the word `the` would become `ago`.
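To make the idea concrete, here is a minimal Python sketch of the example above. The function name `encode` and the choice to leave unmapped characters untouched are my own assumptions for illustration.

```python
def encode(plaintext, substitutions):
    """Replace each letter using the mapping; unmapped characters pass through."""
    return "".join(substitutions.get(ch, ch) for ch in plaintext)


# The three substitutions from the example: t -> a, h -> g, e -> o.
substitutions = {"t": "a", "h": "g", "e": "o"}
print(encode("the", substitutions))  # ago

# Decoding is the same operation with the mapping inverted.
inverse = {cipher: plain for plain, cipher in substitutions.items()}
print(encode("ago", inverse))  # the
```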

Alex Zolynsky, one of the organizers of Learn Something, provided a clearer print out of the message with punctuation symbols written in red (see below). The intention was to ignore the punctuation and focus on the actual words, so before we started, one of the participants assigned a letter to each symbol such that we had the message written in a more familiar alphabet (and one that was easier to type on our laptops).

The annotated message

The resulting message read:

`abbc bdefg hgij kble cmna ompf mlc pangaebc jpkgai nb qgo emq cmllgf`

Clearly, our work was far from done, but before I continue with this blog, I need to come clean.

At Learn Something last week, I was buried in some work of my own, and other than the occasional interjection, I was not involved in the challenge. However, at the end of the night, I took the cipher home with me and the following evening, I got stuck in. It took me about three hours to crack the message. How did I do it?

Well, there are many ways to tackle breaking a substitution cipher. For example, if the ciphertext is long, we could use frequency analysis to gain some hints about the substitutions[2]. However, this ciphertext was short and as such, frequency analysis was not particularly helpful[3]. Instead, I decided to follow the same track as the participants at Learn Something: take the longest word in the phrase, find words in English that had the same pattern, and use that to start reducing the possibilities. So, I opened up LINQPad (my tool of choice for hacking around), and got started.
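For anyone curious what a frequency count looks like, here is a small Python sketch (not the LINQPad code I used; the function name `letter_frequencies` is made up for this example). It simply counts how often each letter appears in the ciphertext.

```python
from collections import Counter


def letter_frequencies(ciphertext):
    """Count how often each letter appears, most common first."""
    letters = [ch for ch in ciphertext if ch.isalpha()]
    return Counter(letters).most_common()


ciphertext = "abbc bdefg hgij kble cmna ompf mlc pangaebc jpkgai nb qgo emq cmllgf"
for letter, count in letter_frequencies(ciphertext)[:5]:
    print(letter, count)
```

With only 56 letters to count, the most common symbols appear just a handful of times each, which is too little to match reliably against typical English letter frequencies.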

The longest word in the ciphertext is `pangaebc`. To match words that follow the same pattern, I treated the word as its own ciphertext and made another substitution, just as the symbols had been substituted for letters earlier. This allowed me to normalize different words to see if they matched the same pattern. Only words that normalized to the same string would be of the same pattern. So, for `pangaebc`, the normalized version is `abcdbefg`. Examples of English words that match this pattern are `airfield`, `windiest`, and the gross `pustular`; each of these has the pattern `abcdbefg` when normalized. In fact, in the dictionary I used there were over 500 matches, which meant over 500 possibilities for the word that could be in our decoded message.

Each of the possible matches for the longest word provided a possible part of the substitution cipher. The next step was to take another word from the ciphertext and find English words that matched the pattern of that ciphertext word as well as the substitutions found from the first word, `pangaebc`. This started to build up a tree of candidate words for the decoded message. Each node in the tree contained words that matched the substitutions required by its parent node but also matched the pattern of a word in the ciphertext. By repeating this approach recursively for each word in the ciphertext sentence, a tree of possible plaintext[4] sentences could be generated.
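The normalization step can be sketched in a few lines of Python. This is an illustration rather than my original LINQPad query; the names `normalize` and `matching_words` and the tiny word list are assumptions made for the example.

```python
def normalize(word):
    """Rewrite a word so its first distinct letter becomes 'a', the next 'b', and so on."""
    mapping = {}
    for ch in word:
        if ch not in mapping:
            mapping[ch] = chr(ord("a") + len(mapping))
    return "".join(mapping[ch] for ch in word)


def matching_words(cipher_word, dictionary):
    """Return the dictionary words whose letter pattern matches the cipher word."""
    pattern = normalize(cipher_word)
    return [word for word in dictionary if normalize(word) == pattern]


# A tiny stand-in for the English word list used in the challenge.
words = ["airfield", "windiest", "pustular", "absolute", "keyboard"]

print(normalize("pangaebc"))              # abcdbefg
print(matching_words("pangaebc", words))  # ['airfield', 'windiest', 'pustular']
```

Because the normalized string has the same length as the word, comparing patterns also filters out words of the wrong length for free.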

Once all words in the ciphertext had been processed, I could take the branches of the resulting tree that included the same number of nodes as words in the ciphertext and determine the possible substitutions that might give the right plaintext solution. This produced 15 possible sentences. It was then up to me to read each one and pick the one that made the most sense. Of course, I'm not going to spoil it by telling you the solutions I found. Instead, I encourage you to give the challenge a go for yourselves and see what you come up with (we both know you could cheat by searching the Internet for an answer, but you're better than that, right?).
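Without giving anything away, here is a rough Python sketch of the recursive search on a toy two-word cipher rather than the real message. The names `extend`, `solve`, and `decode` are hypothetical, and the rule that no two cipher letters may share a plaintext letter is an assumption of this sketch.

```python
def extend(mapping, cipher_word, candidate):
    """Return the mapping extended so cipher_word decodes to candidate, or None on conflict."""
    if len(cipher_word) != len(candidate):
        return None
    extended = dict(mapping)
    used = set(extended.values())
    for cipher_letter, plain_letter in zip(cipher_word, candidate):
        if cipher_letter in extended:
            if extended[cipher_letter] != plain_letter:
                return None  # contradicts an earlier substitution
        elif plain_letter in used:
            return None  # assumption: two cipher letters cannot share a plain letter
        else:
            extended[cipher_letter] = plain_letter
            used.add(plain_letter)
    return extended


def solve(cipher_words, dictionary, mapping=None):
    """Yield every mapping that decodes all cipher words to dictionary words."""
    mapping = {} if mapping is None else mapping
    if not cipher_words:
        yield mapping  # a complete branch: every word has been matched
        return
    first, rest = cipher_words[0], cipher_words[1:]
    for candidate in dictionary:
        extended = extend(mapping, first, candidate)
        if extended is not None:
            yield from solve(rest, dictionary, extended)


def decode(cipher_words, mapping):
    """Apply a complete mapping to the cipher words."""
    return " ".join("".join(mapping[ch] for ch in word) for word in cipher_words)


# Toy example: a two-word ciphertext and a five-word dictionary.
cipher_words = "ago gxa".split()
dictionary = ["the", "hat", "cat", "and", "eat"]

# Start with the longest words, as described above.
ordered = sorted(cipher_words, key=len, reverse=True)
for solution in solve(ordered, dictionary):
    print(decode(cipher_words, solution))  # the hat
```

Each recursive call corresponds to one level of the tree, and only branches that reach the end of the word list produce a candidate sentence. Choosing the sentence that makes sense is still left to a human.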

I really enjoyed tackling this problem. Not only was it a fun distraction, but now I have a LINQPad query that can tackle any simple substitution cipher as long as I know the language in which the message is written. I am definitely looking forward to the next time I attend Learn Something. Hopefully, your interest is piqued and I will see you there. In the meantime, if you give this challenge a go, I would love to hear how you tackled it, what you did differently to me, and what you learned. Until next time, thanks for reading.

Featured image: "Confederate cipher disk" by RadioFan (talk) – I (RadioFan (talk)) created this work entirely by myself. Licensed under CC BY-SA 3.0 via Wikipedia.

  1. to fully solve it programmatically would have needed our software solutions to have an understanding of English sentence structure, which was thankfully outside the scope of the challenge
  2. assuming we know the language in which it is written
  3. I did try it just to see, but frequency analysis needs a longer ciphertext than we had for this challenge
  4. decoded text

Drop the BOM: A Case Study of JSON Corruption in WordPress

In September, I attended Ann Arbor Give Camp, a local event that connects non-profits with the local developer community to fulfill technological goals. As part of the project I was working on, I installed a plugin called CiviCRM into a WordPress deployment that was running on an IIS-based server.

It turned out that WordPress integration for CiviCRM was relatively new and a problem unique to IIS-based deployments existed after installation. This led to a white screen when I tried to access CiviCRM. I spent some time troubleshooting and eventually found the issue after I edited two files to track it down. The fix was quickly implemented. Unfortunately, I then discovered that some other features were not working properly.

The primary places this new issue surfaced were in displaying dialog windows within CiviCRM. It turned out that these dialogs obtained their UI via an AJAX call that returned some JSON and for some reason, jQuery was indicating that the call failed. Investigating further, I saw that the API call was successful (it returned a 200 status result) and the JSON appeared completely fine. How strange.

JSON in binary editor of Visual Studio

I made some debug changes to the JavaScript using the Google Chrome development tools and looked at the failure method jQuery was calling. In doing so, I discovered jQuery was reporting a parsing error for the JSON result. This seemed bizarre; after all, the JSON looked fine to me. I decided to verify it by copying and pasting it into Sublime. Still, the JSON looked just fine. Being tenacious, I saved the JSON to a text file and then opened it in Visual Studio's binary editor and there, the problem appeared. There were two characters at the start of the file before the first brace: byte order marks.

Corrupted JSON in Google Chrome developer tools

A byte order mark (often referred to as a BOM) is a Unicode character used to indicate the endianness (byte order) of a text file or stream[1]. JSON is not supposed to include them at all. In hindsight, I could have seen this issue much sooner if I had paid closer attention to the JSON response in the Network tab of Chrome's developer tools. This view had shown two red dots (see above) before the opening brace, each dot corresponding to a BOM that Chrome knew shouldn't be there. Of course, I had no idea what they meant and so I promptly ignored them. Lesson learned.
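The effect is easy to reproduce outside the browser. The following Python sketch assumes the marks were UTF-8 BOMs (the bytes `EF BB BF`) and uses a made-up response body; the function name `count_leading_boms` is hypothetical.

```python
import codecs
import json


def count_leading_boms(payload: bytes) -> int:
    """Count how many UTF-8 byte order marks sit at the start of the payload."""
    bom = codecs.BOM_UTF8  # b'\xef\xbb\xbf'
    count = 0
    while payload.startswith(bom, count * len(bom)):
        count += 1
    return count


# A made-up response body with two BOMs before the opening brace.
body = codecs.BOM_UTF8 * 2 + b'{"ok": true}'
print(count_leading_boms(body))  # 2

text = body.decode("utf-8")
try:
    json.loads(text)
except json.JSONDecodeError as error:
    print("parse failed:", error)

# Once the marks are removed, the same text parses without complaint.
print(json.loads(text.lstrip("\ufeff")))  # {'ok': True}
```

The marks are invisible when the text is printed or pasted into most editors, which is why the JSON looked fine until I viewed the raw bytes.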

So, armed with the knowledge of why the JSON was causing parser errors, I had to find out what was causing this malformation and fix it. After reading about how a BOM in an incorrectly formatted PHP file[2] could cause BOMs to be prepended in the PHP output, I started looking at each PHP file that would be touched when generating the API response. Alas, nothing initially stood out. I was getting frustrated when I had an epiphany; I had edited exactly two files in trying to fix the installation issue and there were exactly two BOMs. Coincidence?

I went to the two files that I had edited, downloaded them, and discovered they both had BOMs. I re-saved them, this time without a BOM, and uploaded them back to the site, which fixed the JSON corruption and got the CiviCRM plug-in into working order.
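I fixed the files by re-saving them in an editor, but the same repair could be scripted. Here is a minimal Python sketch, again assuming UTF-8 BOMs; the function name `strip_bom` and the sample file are invented for the example.

```python
import codecs
import tempfile
from pathlib import Path


def strip_bom(path):
    """Remove any leading UTF-8 BOMs from a file in place; return True if it changed."""
    path = Path(path)
    original = path.read_bytes()
    stripped = original
    while stripped.startswith(codecs.BOM_UTF8):
        stripped = stripped[len(codecs.BOM_UTF8):]
    if stripped == original:
        return False
    path.write_bytes(stripped)
    return True


# Demonstrate on a throwaway file rather than a real WordPress install.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "example.php"
    sample.write_bytes(codecs.BOM_UTF8 + b"<?php echo 'hello';")
    print(strip_bom(sample))        # True
    print(sample.read_bytes()[:5])  # b'<?php'
    print(strip_bom(sample))        # False
```

Running something like this over only the files you have edited keeps the change small and easy to verify.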

In tracking down and fixing this self-made issue, I learned a few valuable lessons:

  1. Learn to use my developer tools
  2. Never assume it is not my fault
  3. It pays to understand how things work

Hopefully, my misfortune in this one incident will help someone track down their own issue with corrupted JSON in WordPress. If so, please share in the comments. Together, our mistakes can be someone else's salvation.

  1. Wikipedia – http://en.wikipedia.org/wiki/Byte_order_mark
  2. one saved as Unicode with byte order mark

Ann Arbor Give Camp 2014

Last weekend saw the return of Ann Arbor GiveCamp. This community event, organised and hosted by Jay Harris, Hilary Weaver, Ken Patton, and their minions, is held at Washtenaw Community College, which generously accommodates Give Camp on the third weekend in September every year (18th-20th September, 2015, put it in your calendars).

Give Camps are events where volunteers from the tech industry including developers, designers, and artists come together to help out local non-profits with new websites, social media, and other technically-centric projects that might otherwise take dollars away from their primary mission. This year, the combined efforts of the volunteers donated nearly $150,000 of effort[1].

The non-profit our team was earmarked to work with this year unfortunately had to pull out of the event, so, we turned our efforts elsewhere. Each year, Give Camp receives proposals from non-profits that are just not feasible within the 48 hour time-frame and a common theme of many proposals is donor and donation management. While there are many professional solutions available, they often carry a heavy price tag, taking money that otherwise might go to the non-profit's primary mission.

This kind of software is not simple. Most packages contain features to track donors, volunteers, relationships, donations, events, campaigns, and a lot more besides. Given the obvious complexity and nuances of this kind of specialized accounting software, our expectations of achieving much over 48 hours were low. We decided our main goal would be to flesh out a basic design that could be used to kick start some future effort at some future Give Camp. This plan soon changed.

The very first step we took was to research the available solutions. Thanks to the Googlefu of one team member, known to our team as Dr. Mylastname[2], among the large number of commercial offerings we discovered CiviCRM, a free and open source solution with a very extensive and supportive community. Not only that, but CiviCRM supported WordPress integration, making it a natural fit for Give Camp projects[3].

On discovering CiviCRM, our mission changed from specifying some future development effort to evaluating an existing solution. By the end of the weekend, we had two test sites up and running, and a solid set of recommendations for how CiviCRM might be deployed to non-profits at future Give Camps. The project was a great example of how things can change at Give Camp.

As with the two previous Ann Arbor Give Camps I have attended, the team in which I participated faced some unique and intriguing challenges, ultimately reshaping and redefining what could be achieved. I met old friends, made new ones, and saw what can happen when people choose to focus on others. It never ceases to amaze me just how adaptable people can be when failure is ruled out and that is just one of many things that makes Give Camp a fantastic experience. I look forward to next year and hope that you will join us to make it even better.

  1. based on average consulting rates
  2. pronounced, meelastname
  3. WordPress is the CMS of choice for non-profit websites at Ann Arbor Give Camp

CodeMash 2.0.1.4

Adventure

It is almost nine years since I first set foot in the US. It was through that experience that I rediscovered the joy in challenging myself and embracing change, something I had not so strongly felt since I first started singing in a band. So, while I had faced challenges before as a result of my own decisions, none had been bigger. Even though the opportunity had been provided by someone else, it had been my choice to take it and to see it through[1].

It took me a while to settle in to my new home (or even to acknowledge it as home), but I eventually joined the developer community in Ann Arbor and the wider Midwest region. The interaction with other developers has continued to provide challenging opportunities and encourage positive change within my career, as well as other aspects of my life. It was through the basic act of attending one local Ann Arbor .NET Developers Group meeting and the people I met there that I learned about CodeMash.

CodeMash

CodeMash v2.0.1.4 logo
 
The CodeMash conference – a community-organized event held annually in Sandusky, Ohio – never fails to provide unique experiences or challenges. My first CodeMash, CodeMash v2.0.1.2, was unique because I had never attended a developer conference before (or any other conference), and CodeMash v2.0.1.3 provided a completely new experience when, after attending a fantastic workshop on public speaking, I went on to win the PechaKucha contest.

This year, I was guaranteed yet another unique experience when I was accepted to be a speaker. I am extremely grateful to friends, mentors and others for their support and encouragement leading up to speaking at CodeMash v2.0.1.4. It was a wonderful honor that I thoroughly enjoyed, and while it changed my CodeMash experience with the added anxiety of speaking and subsequent release when my session ended, I would definitely do it again if the chance arose.

To those that attended my talk on AngularJS for XAML developers, thank you. I hope that you found it valuable. If you were there or if you have an interest, you can find my slide deck and code on GitHub (Deck|Code).

I am very grateful to the volunteers that organize and run CodeMash each year, as well as the many friends and mentors that have guided my own CodeMash experiences and the many other experiences within the developer community. Without these people, I would not have had such amazing opportunities, nor would I have learned how important it is to challenge myself and strive for new experiences. It is always uncomfortable to embrace change, but the rewards of doing so are often worth the pain.

To close, I encourage you to challenge yourself this year. Make sure to let me know in the comments below how you will challenge yourself and perhaps we can follow-up at the end of the year.

  1. Of course, there were many times in the weeks between being offered the position and setting foot in the US when I considered changing my mind, including just after the plane doors closed

Bloomin' Marvelous

Chrissy posing in the arid house

Last April, Chrissy and I took an impromptu trip to one of our favourite places in the Ann Arbor area, The University of Michigan Matthaei Botanical Gardens. Rather than wander the trails, enjoy the gardens or pop inside the MiSo house, we decided to take a stroll through the conservatory. Recently changed to be a free exhibit, the conservatory is bursting with fascinating and colourful plants from around the world and well worth a leisurely visit should you have the time.

Given that this was a spur of the moment trip, I didn't even think to bring a camera so here is a selection of photographs taken using my phone[1] (I doubt having a camera would've made much difference other than to prolong our visit while I pretended I could get a perfect shot).

In hindsight, I should have taken notes on what I was photographing, but I didn't; perhaps you can identify some of the plants yourselves and post a helpful comment. Please enjoy.

  1. HTC Evo 4G