Why would a code version management technique be good for website content, and how do we use it here at CANDDi?
A review process might sound like overkill for a blog, but when something is published erroneously it can be at best embarrassing, at worst a total disaster. Unfortunately there are no cached copies of Netmag’s infamous “How HTML5 affects SEO” article, which was published without technical review, but the mocking and angry tweets are still out there. Even this week MongoDB published an article (hastily updated) claiming that “Surpassing 100GB of data in your application requires you to have in-depth knowledge of how to operate and run MongoDB”, clearly not something they would really like to have out there.
TL;DR We’re using GitFlow as a process to manage new content on our blog and website. This helps us communicate and review before releasing, and makes large reworkings more manageable.
As we covered in a previous post, we’ve switched the public site for CANDDi to use Jekyll. This means all of our content is now written in Markdown files and version controlled using Git. As with nearly all other code in our business, we use GitFlow to manage version control, and the content of this site is no different.
What is GitFlow and why would I use it?
GitFlow is a simple branching strategy for Git. It maintains two long-running branches (Master and Develop) and has three types of short-lived branches (Hotfix, Feature and Release), which exist for long enough to do a piece of work and are then merged back into the code base. There is an excellent, if lengthy, description of GitFlow on nvie.com.
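To make the branch types concrete, here is a minimal sketch of the model in Python. The lowercase branch names and the `feature/`, `release/` and `hotfix/` prefixes are assumptions borrowed from the common git-flow defaults, and the `BranchType` and `merge_targets` names are made up for illustration; the branch-from and merge-into rules follow the standard GitFlow description rather than anything specific to our setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchType:
    """One kind of short-lived GitFlow branch (hypothetical helper type)."""
    name: str
    branches_from: str   # long-running branch it is cut from
    merges_into: tuple   # long-running branches it is merged back into


# Rules from the standard GitFlow model; branch names are assumed defaults.
GITFLOW = {
    "feature": BranchType("feature", "develop", ("develop",)),
    "release": BranchType("release", "develop", ("master", "develop")),
    "hotfix": BranchType("hotfix", "master", ("master", "develop")),
}


def merge_targets(branch_name: str) -> tuple:
    """Return the branches that a prefixed branch name merges back into."""
    kind = branch_name.split("/", 1)[0]
    return GITFLOW[kind].merges_into


if __name__ == "__main__":
    for example in ("feature/gitflow-article", "hotfix/fix-typo"):
        print(example, "->", merge_targets(example))
```

The point of the table is that only Hotfix branches start from Master, and anything merged into Master is also merged into Develop so the two never drift apart.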
Our Master branch always mirrors what is live and is the only branch we deploy. We only take Hotfixes off Master, where something is “broken” in live, i.e. an existing article needs changing, spelling mistakes etc. This means changes like this can still happen even if we’re knee-deep in a huge overhaul.
Our Develop branch holds the next release of canddi.com before it is merged into Master. This might be small updates, like new articles or non-urgent fixes, or it might be a huge reworking (we’ve only done one of these so far). Either way, because it’s on a separate branch to Master we no longer face the twin nightmares that are:
- “but we’re in the middle of rebuilding the site, can’t this change wait?” - we’d do it onto Master and merge down.
- “how do we migrate the database to the new structure?” - the content is just code, and code migrations we understand.
These, however, are not regular situations, unlike writing and posting a new article.
How does a new post get written, reviewed and posted?
A new article, such as this one, is essentially the same as a feature request. It’s a new thing, which mustn’t break existing things, and shouldn’t be released until everybody is happy it is good to go. Here’s how this article was written and will be published:
All of our development work runs through a Scrum process and is managed through JIRA. This article is no different: it has a ticket in this Sprint and I’ll demo it. This might sound like overkill, but writing here is not a core part of my job (yet), so tracking it as we track all other development means we can manage it.
Working on a branch
I’ve cut a Feature branch off Develop. All changes to this article will be committed to this branch, and eventually the branch will be merged back into Develop via a Pull Request. Until it is merged back, it is virtually impossible for me to publish this by mistake.
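As a rough sketch, cutting that branch comes down to a handful of Git commands. The Python below only builds the command lists and never runs them. The remote name `origin`, the lowercase `develop` branch, the `feature/` prefix, the `gitflow-article` slug and the `start_feature_commands` helper are all assumptions for illustration, not a record of exactly what we typed.

```python
def start_feature_commands(slug, base="develop", remote="origin"):
    """Build the Git commands that cut a feature branch off the base branch.

    Nothing is executed here; each command is returned as an argv list that
    could be handed to subprocess.run() inside a real checkout.
    """
    branch = f"feature/{slug}"  # assumed naming convention
    return [
        ["git", "checkout", base],               # start from Develop
        ["git", "pull", remote, base],           # make sure it is up to date
        ["git", "checkout", "-b", branch],       # cut the feature branch
        ["git", "push", "-u", remote, branch],   # publish it, ready for a PR
    ]


if __name__ == "__main__":
    for argv in start_feature_commands("gitflow-article"):
        print(" ".join(argv))
```

Because the article only ever lives on its own branch until the Pull Request is merged, nothing here touches Master.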
I’m currently writing this in Sublime Text with the Markdown Preview plugin because it’s what I spend half my working life using. Our designer Claire prefers to use Espresso, which is no problem: Markdown is Markdown. Non-technical staff make use of a bunch of different tools, but the hosted version of dillinger.io seems very popular.
Thanks to the separation of content and styling, it doesn’t matter that we’re all using different editors. The Markdown will be clean and the resulting HTML will be perfectly styled, without all of the nested divs, inline styling and other horrors that common web-based GUI text editors produce (“Pasted from Word? OH HAI let me mangle that for you!”).
Review via Pull Request
I’ve committed a couple of times throughout the writing of this piece, mostly to save where I’m up to, but in a couple of cases for much more useful reasons.
After an enlightening conversation at a recent conference with GitHub’s @cardioid I was introduced to the idea of the long-running PR for review. Basically I’ve opened a PR from my feature branch back to Develop after my first commit, and so all my commits can be seen and reviewed (just like all our code). This means that when I sketched the headings and content for this piece at the very start I could ping a quick “Hey, can you give this a once over” to Tim, and he suggested some changes. I edited, committed and pushed, got the thumbs up and carried on.
Once I’m happy with it, I’ll make a final commit, push up to GitHub and my PR will automagically update itself. I’ll then ask for it to be reviewed (I am an atrocious speller) and we might iterate the “fix, commit, push, review” loop again. However, once everybody is happy, this PR will be merged into Develop on GitHub.
Release, build and ship
Now the Develop branch has my article and possibly some other cool stuff we want to get out, so we create a Release branch from it and then merge this down into Master. In this case the changes are trivial, an article and a couple of tweaks, but the process would be exactly the same if we were replacing the entire site. The Release branch is cut from Develop, checked (effectively integration testing for content) and merged into Master. Master is then pushed up to GitHub and we’re set to build and ship.
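The release step can be sketched the same way, as a list of Git commands that is built but not executed. The `release/` prefix, the version label, the `origin` remote and the `release_commands` helper are assumptions for illustration. Merging the release back into Develop and deleting the branch afterwards comes from the standard GitFlow model rather than from the description above.

```python
def release_commands(version, remote="origin"):
    """Build the Git commands for a GitFlow-style release (not executed)."""
    branch = f"release/{version}"  # assumed naming convention
    return [
        ["git", "checkout", "-b", branch, "develop"],  # cut Release from Develop
        # ... check the content on the release branch here ...
        ["git", "checkout", "master"],
        ["git", "merge", "--no-ff", branch],           # merge Release into Master
        ["git", "push", remote, "master"],             # Master goes up to GitHub
        ["git", "checkout", "develop"],
        ["git", "merge", "--no-ff", branch],           # keep Develop in step
        ["git", "branch", "-d", branch],               # tidy up the release branch
    ]


if __name__ == "__main__":
    for argv in release_commands("2014-03"):
        print(" ".join(argv))
```

Using `--no-ff` keeps a merge commit for each release, which makes it easier to see in the history what shipped together.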
We build with a simple Jekyll command, and ship up the line with rsync.
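For illustration, a build-and-ship step along those lines might look like the sketch below. The rsync flags, the `_site` output directory (Jekyll’s default) and the destination `deploy@example.com:/var/www/site/` are assumptions, not our actual settings, and `build_and_ship_commands` is a hypothetical helper that only builds the command lists.

```python
def build_and_ship_commands(destination, site_dir="_site"):
    """Build the commands to generate the site and copy it to a server.

    Nothing is executed here; pass each argv list to subprocess.run()
    from the root of the Jekyll project to actually run it.
    """
    return [
        # Generate the static site into site_dir.
        ["jekyll", "build", "--destination", site_dir],
        # Copy the generated files up; the trailing slash syncs the
        # directory contents, and --delete removes stale remote files.
        ["rsync", "-az", "--delete", f"{site_dir}/", destination],
    ]


if __name__ == "__main__":
    # Hypothetical destination, for illustration only.
    for argv in build_and_ship_commands("deploy@example.com:/var/www/site/"):
        print(" ".join(argv))
```

Be careful with `--delete` when trying this out: it removes anything on the destination that is not in the local build.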
The side effects of content as code
I’ve mostly talked here about the tech process of using GitFlow to manage content as code, so I think it’s worth pointing out some of the benefits.
Thanks to combining GitFlow with a Fork and Pull GitHub methodology, it is very easy to get other people to look at your changes. There are no more draft articles, and easy version control means you can go back to a previous iteration if needed. Version control for content is something you don’t realise is cool until you have it.
Distraction free editing because there is no formatting
I used to write in JDarkRoom a lot because I found I could really get into my flow when I had no distractions. Writing in a WYSIWYG environment led to me messing with formatting and styling endlessly. Coding environments are very often optimised for getting you into your zone and keeping you there, and I find Sublime a lovely writing environment.
Matches up with Kanban and Scrum processes nicely
More and more we’re using Kanban boards or a Scrum approach for different bits of our business, and because content can now be thought of as code, writing posts just drops into the workflow perfectly. An article or a fix is a discrete piece of work and can be shipped on its own. GitFlow helps make sure that the right things ship at the right time and that no changes are lost.
Content as code avoids all the nightmares of migrations
As alluded to previously, migrating content between versions of a site can be a freaking nightmare, and is a complete nightmare to revert. If all your content is in flat files this gets easier. By bolting GitFlow into your working process you also avoid all the other horrors that can appear, because any changes to live will be merged down into Develop.
Worth the overhead?
Implementing GitFlow might seem like a lot of overhead, but once you’re confident working in Git (we’ll cover the tools our non-technical staff use to work with Git in another post), GitFlow is easy. You have to branch once, and learn to open Pull Requests, but after that there is virtually no overhead for your content creators. We think it’s worth every minute.