The Keyrus Blog Part I - Choosing our Stack

By Lewis Fogden, Mon 28 November 2016, in category Web development

aws, nginx, pelican, python

  
Now that the Keyrus UK blog has been up and running for a little while, with both the design and underlying processes settling into place, we thought it would be useful to share how exactly we went about creating it. So over the next month or two, we'll be releasing a series of articles detailing that journey.

For the first of these posts, let's focus on the technology stack used, and the reasoning behind each choice.

To be Static or not to be

The first major decision was whether to go all out with a full CMS, to build a small web app with a framework like Django, Rails, or Flask, or to follow the route of many a tech blogger with a lightweight, static blog. If you've had a few glances around, you'll know we went with the latter, but let's try to retrace the steps that led to that decision.

CMSs (Content Management Systems)

WordPress, Joomla, Drupal... With so many choices available, even picking a CMS can seem overwhelming. And that's just the beginning - hosted or self-hosted, with or without a GUI, and so on.

WordPress is undoubtedly one of the first words that springs to mind when most people think of blogging. Millions of sites run on it, and for a simple web page, it provides a fairly inexhaustible amount of customisation via its many themes and plugins. Layouts, colour schemes, SEO, caching, anti-spam... You name it, and someone's probably built it for WordPress already. True, it doesn't offer the flexibility a custom application could, but for the occasional article, that isn't always needed anyway. You can choose to self-host, or have them host it for you, and even manage the whole implementation via a web GUI, without writing a single line of code.

Joomla and Drupal are probably the next two biggest contenders, both slightly less user friendly, but more flexible systems that require a bit more coding chops. Like WordPress, they're both based on PHP and by default use MySQL as their underlying database, with customisation revolving largely around themes and extensions (Joomla) / modules (Drupal). Setup of either is a bit more extensive than WordPress, requiring you to install the database alongside the CMS and create and edit various config / settings files.

Generally speaking, CMSs are praised on the following points:

And where they fall down a bit is:

When thinking of the requirements for the blog, the main needs were simplicity (of both design and administration), speed (most users will quickly abandon a site with poor load times), and scalability (in case someone other than us is reading this). Hence for such a project, a CMS wasn't the right fit.

With the constant updates required, security concerns, plugin compatibility issues, and all the rest, a more minimalist solution seemed appropriate.

Web Application Frameworks

Here at Keyrus, we're big fans of Python, so perhaps building the blog with a web application framework like Django or Falcon would've been a more obvious choice. Most Squirro projects we do feature a Flask app somewhere or other, and we've previously used Rails to develop systems for our friends at The Turing Trust.

However, coming back to the simplicity requirement above, setting up a Django or Rails app felt like overkill, and although Flask is intended for small projects, all we really needed was something that could generate HTML pages to be rendered by a browser. The ability for users to interact with the blog, comments aside, wasn't really on the list - logins, uploads, shopping carts, and other such functionality just weren't needed.

Static Site Generators

You know where this is heading. The growing popularity of static site generators such as Jekyll, Hugo, and Octopress over recent years for blogging and micro-sites demonstrates the power in their simplicity. In some form or another, each static generator takes an input (rst or Markdown, for instance), pushes that input through a templating engine (Liquid, for instance) and asset pipeline, and outputs plain HTML files alongside any static assets such as CSS, JavaScript, and images.
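The input, template, output flow described above can be sketched in a few lines of Python. This is only an illustration, not how Jekyll, Hugo, or Pelican are implemented: the standard library's string.Template stands in for a real templating engine such as Liquid or Jinja2, the plain-text input format (title on the first line, blank-line separated paragraphs) is a simplification of Markdown or rst, and the names PAGE, parse_article, render_article, and build_site are all hypothetical.

```python
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; a real generator would use Jinja2 or Liquid.
PAGE = Template(
    "<!DOCTYPE html>\n"
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n$body\n</body></html>\n"
)


def parse_article(text: str) -> tuple[str, list[str]]:
    """Split a source file into a title (first line) and body paragraphs."""
    title, _, rest = text.strip().partition("\n")
    paragraphs = [p.strip() for p in rest.split("\n\n") if p.strip()]
    return title.strip(), paragraphs


def render_article(text: str) -> str:
    """Push the parsed content through the template and return plain HTML."""
    title, paragraphs = parse_article(text)
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return PAGE.substitute(title=escape(title), body=body)


def build_site(content_dir: Path, output_dir: Path) -> list[Path]:
    """Render every .txt file in content_dir to an .html file in output_dir."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(content_dir.glob("*.txt")):
        target = output_dir / (source.stem + ".html")
        html = render_article(source.read_text(encoding="utf-8"))
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written
```

Calling build_site(Path("content"), Path("output")) would write one HTML file per article into the output folder, which is all a web server then has to hand out. No code runs per request.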

Why is this a good thing? While it's true that this approach removes some of the flexibility of the frameworks listed above, what it offers in return is the following:

Of course, it's not all sunshine and rainbows. You are trading away the following:

However, with third-party tools like Disqus, Isso, and SwiftType allowing you to outsource your comments and other dynamic content, moving the code from the server to the client, these constraints aren't nearly as constricting as they might seem at first. Sure, if you need comments, search, upload forms, social media, chat, and so on, it's probably better to build a web app than to rely on six different outsourced services, but if you need just the one...

The next decision was which generator to use. When considering any tool, library, or language, the best choice takes into account the skills of the people who might work with it and the environment it will sit within. Due to the widespread presence of Python skills in the company, Pelican seemed the obvious choice. We could be confident that a range of people could easily contribute to or take over development.

Pelican also uses the Jinja2 templating engine, as do Flask and Ansible, so that familiarity was an added bonus: one less thing to learn. Its maturity meant there was a vast array of themes already in existence, and integration with tools like Disqus, Twitter, and Google Analytics was available right off the bat.
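To give a flavour of that integration, here is a minimal sketch of a Pelican settings file. Every value is a placeholder rather than the blog's actual configuration, and whether DISQUS_SITENAME, GOOGLE_ANALYTICS, and TWITTER_USERNAME have any effect depends on the chosen theme reading them (Pelican's bundled notmyidea theme is assumed here).

```python
# pelicanconf.py - illustrative values only, not the blog's real settings.

AUTHOR = "Example Author"      # placeholder
SITENAME = "Example Blog"      # placeholder
SITEURL = ""                   # left empty for local development

PATH = "content"               # folder holding the Markdown / rst articles
TIMEZONE = "Europe/London"
DEFAULT_LANG = "en"

# Theme to render with; assumed to be Pelican's bundled default.
THEME = "notmyidea"

# Third-party integrations, picked up by themes that support them.
DISQUS_SITENAME = "example-blog"    # hypothetical Disqus shortname
GOOGLE_ANALYTICS = "UA-00000000-0"  # hypothetical tracking ID
TWITTER_USERNAME = "example"        # hypothetical handle
```

The appeal is that each integration is a single line of Python in one file, with the theme's Jinja2 templates doing the rest.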

The only real issue was the input format. As with most static site generators, markdown (or rst) was the preferred language. While the syntax is simple to learn, getting everyone in the company to learn and submit articles in markdown, from the technical consultants to the sales team, would've been a project in itself. Hence a .docx to .md converter was needed. That's a topic for another post though.
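To make the idea concrete, here is a rough sketch of how such a conversion could work using only Python's standard library. It is not the converter built for the blog, which is left for a later post. It relies on a .docx file being a zip archive containing word/document.xml, and it assumes headings carry style IDs of the form Heading1, Heading2, and so on, which will not hold for every document. It ignores bold, italics, lists, tables, and images entirely. The function name docx_to_markdown is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as w:p and w:t.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_markdown(docx_file) -> str:
    """Convert the paragraphs of a .docx (path or file object) to Markdown."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    for para in root.iter(W + "p"):
        # A paragraph's text is spread across one or more w:t elements.
        text = "".join(node.text or "" for node in para.iter(W + "t")).strip()
        if not text:
            continue
        style = para.find(f"{W}pPr/{W}pStyle")
        style_name = style.get(W + "val", "") if style is not None else ""
        # Assumption: heading paragraphs use style IDs like "Heading2".
        if style_name.startswith("Heading") and style_name[7:].isdigit():
            lines.append("#" * int(style_name[7:]) + " " + text)
        else:
            lines.append(text)
    return "\n\n".join(lines) + "\n"
```

Anything beyond headings and plain paragraphs needs considerably more work, which is why a real converter is a project of its own.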

HTTP, At Your Service

As there are probably entire sections of the internet devoted to what the best HTTP server for a simple static website is, let's not go into that. We looked briefly at Apache and NGINX, the two most popular Linux-based HTTP servers, and settled on NGINX for its quick processing of static files and ability to handle many concurrent connections. For a good comparison of the two, check out Digital Ocean's tutorial.

Infrastructure

As with all things, when it comes to choosing infrastructure, you've got your fair share of choices. Presuming you don't have your own in-house servers (we use ours just for internal applications), you'll need to choose a cloud vendor. IaaS (Infrastructure as a Service) and PaaS (Platform as a Service) providers such as AWS, Digital Ocean, Heroku, Engine Yard, Rackspace, and Google Compute Engine provide virtual environments that can be deployed, scaled, and configured using either simple web interfaces or calls to their APIs.

When choosing a vendor, the main thing to consider is how close to the metal you want to get. Do you want your environment fully configured for you, so that all you have to do is push your application over with Git? Heroku's dynos might be what you're after. Can you find your way around the command line and like full control over your environment? Digital Ocean's droplets or AWS's EC2 instances might be more to your taste. Looking for something in between? Rackspace or Engine Yard's cloud platforms might take your fancy.

We chose AWS to host our blog for several reasons. Firstly, it provides a simple, user-friendly web GUI and a CLI, and has an abundance of Ansible modules for automatic configuration / deployment. This combination allowed us to set up the initial server with just a few clicks and, after determining what the environment required, write a playbook to automate its creation. Secondly, Amazon's competitive pricing structure is rather hard to beat (and is most comparable to Digital Ocean's). Our last point is familiarity once again. The existing knowledge of AWS in the company, from full projects, POCs, and demo and test environments, means that a wide range of people could assume control of the blog if needed. As with many things, sometimes the best technology to use is the one you already know.

Source Control

Another topic with no right answer: whether you use Git, Subversion, Mercurial, or even Microsoft's Team Foundation Server, the important thing is that you are using source control. Git is our preferred flavour, and combined with a Bitbucket repo, it provides all of the security and traceability you would expect from a VCS, as well as a centralised place for the team to commit to.

Updates to the blog can be pulled straight from Bitbucket, and both the Bitbucket web interface and Atlassian's SourceTree desktop application allow us to explore our commit history in a more visual way. For walkthroughs on how to set up, use, and migrate to Git, check out the tutorials here.

Summary

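To recap the rather lengthy explanation above, our blog stack can be roughly summed up as Pelican for static site generation, NGINX as the HTTP server, AWS for infrastructure, and Git with Bitbucket for source control.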

Look out for the second part of this series in the next few weeks, where we'll be further discussing Pelican, Python's leading static blog generator.