Symbolic Forest

A homage to loading screens.

We can rebuild it! We have the technology! (part one)

I said the other day I’d write something about how I rebuilt the site, what choices I made and what coding was involved. I’ve a feeling this might end up stretched into a couple of posts or so, concentrating on different areas. We’ll start, though, by talking about the tech I used to redevelop the site, and, indeed, how websites tend to be structured in general.

Back in the early days of the web, 25 or 30 years ago now, to create a website you wrote all your code into files and pushed it up to a web server. When a user went to the site, the server would send them exactly the files you’d written, and their browser would display them. The server didn’t do very much at all, and nor did the browser, but sites like this were a pain to maintain. If you look at this website, aside from the text in the middle you’re actually reading, there’s an awful lot of stuff which is the same on every page. We’ve got the header at the top and the sidebar down below (or over on the right, if you’re reading this on a desktop PC). Moreover, look at how we show the number of posts I’ve written each month, or the number in each category. One new post means every single page has to be updated with the new count. Websites from the early days of the web didn’t have that sort of feature, because they would have been ridiculous to maintain.

The previous version of this site used Wordpress, technology from the next generation onward. With Wordpress, the site’s files contain a whole load of code that’s actually run by the web server itself: most of it written by the Wordpress developers, some of it written by the site developer. The code contains templates that control how each kind of page on the site should look; the content itself sits in a database. Whenever someone loads a page from the website, the web server runs the code for that template; the code finds the right content in the database, merges the content into the template, and sends it back to the user. This is the way that most Content Management Systems (CMSes) work, and is really good if you want your site to include features that are dynamically-generated and potentially different on every request, like a “search this site” function. However, it means your webserver is doing much more work than if it’s just serving up static and unchanging files. Your database is doing a lot of work, too, potentially. Databases are seen as a bit of an arcane art by a lot of software developers; they tend to be a bit of a specialism in their own right, because they can be quite unintuitive to get the best performance from. The more sophisticated your database server is, the harder it is to tune it to get the best performance from it, because how the database is searching for your data tends to be unintuitive and opaque. This is a topic that deserves an essay in its own right; all you really need to know right now is that database code can have very different performance characteristics when run against different sizes of dataset, not just because the data is bigger, but because the database itself will decide to crack the problem in an entirely different way. 
Real-world corporate database tuning is a full-time job; at the other end of the scale, you are liable to find that as your Wordpress blog gets bigger as you add more posts to it, you suddenly pass a point where pages from your website become horribly slow to load, and unless you know how to tune the database manually yourself you’re not going to be able to do much about it.
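As a rough sketch of that request cycle, here is roughly what "find the content in the database and merge it into the template" looks like. This is not Wordpress's code: the table layout, the template and the function names are all made up for illustration, and an in-memory SQLite database stands in for a real database server.

```python
import sqlite3
from string import Template

# Hypothetical page template; a real CMS would have one per kind of page.
POST_TEMPLATE = Template(
    "<h1>$title</h1>\n<p>$body</p>\n<aside>$count posts so far</aside>"
)


def make_database() -> sqlite3.Connection:
    """Create an in-memory database holding a couple of example posts."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
    db.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            ("hello", "Hello", "First post."),
            ("rebuild", "We can rebuild it!", "Part one."),
        ],
    )
    return db


def render_post(db: sqlite3.Connection, slug: str) -> str:
    """Do the work a dynamic CMS repeats on every request: query, then merge."""
    row = db.execute(
        "SELECT title, body FROM posts WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return "<h1>Not found</h1>"
    (count,) = db.execute("SELECT COUNT(*) FROM posts").fetchone()
    return POST_TEMPLATE.substitute(title=row[0], body=row[1], count=count)


db = make_database()
page = render_post(db, "rebuild")
```

Every page view costs two queries and a template merge here, which is the per-request work the paragraph above describes.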

I said that’s how most CMSes work, but it doesn’t have to be that way. If you’ve tried blogging yourself you might have heard of the Movable Type blogging platform. This can generate each page on request like Wordpress does, but in its original incarnation it didn’t support that. The software ran on the webserver like Wordpress does, but it wasn’t needed when a user viewed the website. Instead, whenever the blogger added a new post to the site, or edited an existing post, the Movable Type software would run and generate all of the possible pages that were available so they could be served as static pages. This takes a few minutes to do each time, but that’s a one-off cost that isn’t particularly important, whereas serving pages to users becomes very fast. Where this architecture falls down is if that costly regeneration process can be triggered by some sort of end-user action. If your site allows comments, and you put something comment-dependent into the on-every-page parts of your template - the number of comments received next to links to recent posts, for example - then even small changes in the behaviour of your end-users can hugely increase the load on your site. I understand Movable Type does now support dynamically-generated pages as well, but I haven’t played with it for many years so can’t tell you how the two different architectures are integrated together.
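To see why one small change can touch the whole site, here is a minimal sketch of publish-time regeneration. It is not how Movable Type is actually implemented; `render_all` and `publish` are hypothetical names, and the "sidebar" is reduced to a single post count.

```python
def render_all(posts: dict[str, str]) -> dict[str, str]:
    """Render every page up front; each page carries the site-wide post count."""
    sidebar = f"<aside>{len(posts)} posts</aside>"
    return {
        slug: f"<article>{body}</article>{sidebar}" for slug, body in posts.items()
    }


def publish(posts: dict[str, str], pages: dict[str, str], slug: str, body: str) -> int:
    """Add a post, regenerate the site, and report how many pages changed."""
    posts[slug] = body
    new_pages = render_all(posts)
    changed = sum(1 for s, html in new_pages.items() if pages.get(s) != html)
    pages.clear()
    pages.update(new_pages)
    return changed


posts = {"one": "First", "two": "Second"}
pages = render_all(posts)
changed = publish(posts, pages, "three", "Third")  # changed == 3
```

Adding one post rewrites all three pages, because the count in the shared sidebar changed on each of them. Swap "post count" for "comment count" and every comment triggers the same full rebuild.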

Nowadays most heavily-used sites, including blogs, have moved towards what I suppose you could call a third generation of architectural style, which offloads the majority of the computing and rendering work onto the user’s browser. The code is largely written using JavaScript frameworks such as Facebook’s React, and on the server side you have a number of simple “microservices” each carefully tuned to do a specific task, often a particular database query. Your web browser will effectively download the template and run the template on your computer (or phone), calling back to the microservices to load each chunk of information. If I wrote this site using that sort of architecture, for example, you’d probably have separate microservice calls to load the list of posts to show, the post content (maybe one call, maybe one per post), the list of category links, the list of month links, the list of popular tags and the list of links to other sites. The template files themselves have gone full-circle: they’re statically-hosted files and the webserver sends them back just as they are. This is a really good system for busy, high-traffic sites. It will be how your bank’s website works, for example, or Facebook, Twitter and so on, because it’s much more straightforward to efficiently scale a site designed this way to process high levels of traffic. Industrial-strength hosting systems, like Amazon Web Services or Microsoft Azure, have moved in ways to make this architecture very efficiently hostable, too. On the downside, your device has to download a relatively large framework library, and run its code itself. It also then has to make a number of round-trips to the back-end microservices, which can take some time on a high-latency connection. This is why sometimes a website will start loading, but then you’ll just have the website’s own spinning wait icon in the middle of the screen.
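A very loose sketch of that "one call per chunk" pattern follows. In reality this would be JavaScript running in the browser and each call would be an HTTP request; here plain Python functions stand in for the services, and their names and return values are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-ins for back-end microservices; each would really be an HTTP call.
def list_posts() -> list[str]:
    return ["We can rebuild it! (part one)"]


def list_categories() -> list[str]:
    return ["Geekery"]


def list_months() -> list[str]:
    return ["Example month"]


SERVICES = {"posts": list_posts, "categories": list_categories, "months": list_months}


def load_page() -> dict[str, list[str]]:
    """Call every service concurrently and gather the chunks the template needs."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(call) for name, call in SERVICES.items()}
        return {name: future.result() for name, future in futures.items()}


page_data = load_page()
```

The calls are issued in parallel so that the slowest one, rather than the sum of all of them, decides how long the spinner stays on screen. On a high-latency connection even that single round-trip is noticeable.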

Do I need something quite so heavily-engineered for this site? Probably not. It’s not as if this site is intended to be some kind of engineering portfolio; it’s also unlikely ever to get a huge amount of traffic. With any software project, one of the most important things to do to ensure success is to make sure you don’t get distracted from what your requirements actually are. The requirements for this site are, in no real order, to be cheap to run, easy to update, and fun for me to work on; which also implies I need to be able to just sit back and write, rather than spend long periods of time working on site administration or fighting with the sort of in-browser editor used by most CMS systems. Additionally, because this site does occasionally still get traffic to some of the posts I wrote years ago, if possible I want to make sure posts retain the same URLs as they did with Wordpress.

With all that in mind, I’ve gone for a “static site generator”. This architecture works in pretty much the same way as the older versions of Movable Type I described earlier, except that none of the code runs on the server. Instead, all the code is stored on my computer (well, I store it in source control, which is maybe a topic we’ll come back to at another time) and I run it on my computer, whenever I want to make a change to the site. That generates a folder full of files, and those files then all get uploaded to the server, just as if it was still 1995, except nowadays I can write myself a tool to automate it. This gives me a site that is hopefully blink-and-you’ll-miss-it fast for you to load (partly because I didn’t incorporate much code that runs on your machine), that I have full control over, and that can be hosted very cheaply.
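The "generate a folder full of files" step can be sketched in a few lines. This is not Wintersmith, which does a great deal more; `build_site` is a hypothetical helper and the date-based URL is an invented example of the sort of permalink you might want to preserve.

```python
import tempfile
from pathlib import Path


def build_site(posts: dict[str, str], out_dir: Path) -> list[Path]:
    """Write one index.html per post so that /<url-path>/ keeps working."""
    written = []
    for url_path, html in posts.items():
        target = out_dir / url_path.strip("/") / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


# Hypothetical post URL in a date-based permalink style.
posts = {"/2020/09/rebuilding-the-site/": "<h1>We can rebuild it!</h1>"}
out_dir = Path(tempfile.mkdtemp())
written = build_site(posts, out_dir)
```

Writing each post to `<url-path>/index.html` is one common way to keep old URLs alive on a plain file server, since most web servers serve `index.html` when asked for a directory.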

There are a few static site generators you can choose from if you decide to go down this architectural path, assuming you don’t want to completely roll your own. The market leader is probably Gatsby, although it has recently had some well-publicised problems in its attempt to repeat Wordpress’s success in pivoting from being a code firm to a hosting firm. Other popular examples are Jekyll and Eleventy. I decided to go with a slightly less-fashionable but very flexible option, Wintersmith. It’s not as widely-used as the others, but it is very small and slim and easily extensible, which for me means that it’s more fun to play with and adapt, to tweak to get exactly the results I want rather than being forced into a path by what the software can do. As I said above, if you want your project to be successful, don’t be distracted away from what your requirements originally were.

The downside to Wintersmith, for me, is that it’s written in CoffeeScript, a language I don’t know particularly well. However, CoffeeScript code is arguably just a different syntax for writing JavaScript,* which I do know, so I realised at the start that if I did want to write new code, I could just do it in JavaScript anyway. If I familiarised myself with CoffeeScript along the way, so much the better. We’ll get into how I did that, how I built this site and wrote my own plugins for Wintersmith to do it, in the next part of this post.

* This sort of distinction (is this a different language, or is this just a dialect?) is the sort of thing which causes controversies in software development almost as much as it does in the natural languages. However, CoffeeScript’s official website tries to avoid controversy by taking a clear line on this: “The golden rule of CoffeeScript is: ‘it’s just JavaScript’”.

Photo post of the (insert arbitrary time period here)

Or, back to the railway

Back to the railway and the quiet post-viral timetable it is running at the moment. One nice thing about this timetable is that it gives me the opportunity to take my camera along and photograph the trains when they’re stood still, and the station when there’s no trains about. Normally you’re too busy to have chance for that sort of thing.

Bewdley station

Pannier tank

Bewdley North signalbox

2857 at Bewdley

Photo post of the week

Dw i wedi mynd i weld Sion Corn

Up to North Wales for the weekend, to help out with the trenau Sion Corn. My Welsh isn’t good enough yet to actually speak it, but good enough to understand when I hear one of the drivers trying to persuade a small boy that the loco is actually powered by a dragon inside the firebox, a la Ivor The Engine. The boy wasn’t having any of it.

The weather was grey, steely and windy. At times you could see across the Traeth; at times visibility was down to a hundred yards or so. Naturally, the time it decided to rain sideways was about five minutes after we’d decided we’d have time to walk over to Harbour Station before the rain started.

Cleaning out the ashpit

In the middle of The Cob

Overnight the storm grew worse, and in my bunk I could hear the wind outside and the rain hammering on the window. The next morning I was up early, so we could do a short-notice early-morning shunt to get a loco out of the Old Shed; as we shunted, it was pitch-black and cold but at least the wind had died down a little. As the locos started to warm up and come to life the dawn broke to show that there seemed to be just as much water, or more, on the landward side of the embankment as on the open-sea side. The salt marshes between the Cob and the Cambrian line’s embankment were a choppy, whitecapped sea, and inland the flooding went up the Traeth almost as far as if the Cob had never been built.

Flooded fields at Pont Croesor

Late arrival

Or, missing the train

I keep meaning to tell the tale of one of the most optimistic heritage railway passengers I’ve ever seen.

I took the kids to Totnes Rare Breeds Farm last week. If you don’t know Totnes: the town is on the west bank of the River Dart. The railway running past the town, coming from Plymouth, crosses the river, and on the east bank of the river forks in two. The right-hand fork is the main line, running eastwards to the head of the Teign estuary and thence along the coast to Exeter. The left-hand fork is a steam railway which runs up the valley of the Dart as far as Buckfastleigh, famous for its abbey and its tonic wine very popular in Scotland. Just to confuse you, both railway lines were originally built by the South Devon Railway, but nowadays the steam railway is reusing that name and the main line is just, well, the main line into Cornwall. Anyway, in the V where the railway forks, just on the east bank of the river, is Totnes Rare Breeds Farm, and it has no road access, indeed, no public access at all other than via the railway. If you want to arrive on foot, you must walk to the steam railway station (they have a footbridge over the river), through the station, across a little level crossing and into the farm. The level crossing has gates, just like a full-sized one, which the railway’s signalling staff lock shut when trains are arriving and departing.

We were sat feeding boiled eggs to a 93-year-old tortoise,* even older than The Mother Grandma, with another family, when I heard a sound from the station: the sound of the vacuum ejector on the train waiting to depart. In other words, the driver had just started to release the brakes ready to go. I checked the time: just coming up to Right Time for the next train. Looks like it will be a perfect-time departure.

“We’d better get going,” said the dad of the other family, “we need to catch that train.” And they got up and left. I thought it might be a bit cruel to tell them they’d almost certainly already missed it. The gates would already be locked, and even by the time they reached them, the train would probably be moving.

* one volunteer told us it was 94 and another 92 so I’m splitting the difference

Sitting by the fire

In which we regress

So it didn’t snow. I was back on the railway yesterday, and everything went rather well. None of the equipment failed, I didn’t do anything stupid, and I didn’t drop any tokens, which is always my biggest worry. It was a relatively quiet shift; I sat in the big armchair with the coal stove roaring away next to me, handwriting a diary piece about how sitting in the big armchair with the coal stove roaring away next to me and the clock ticking on the wall reminded me of visiting my grandmother’s house on winter Saturday afternoons when I was small. I was the first person to arrive at the station; and by the time I left all the station staff had already locked up and left too, it was getting dark, and all the lights were on. Although it didn’t snow, it felt all day as if snow was potentially on the menu.

I do wish the children could come with me to the railway, but I doubt that getting them in the same room as a cast-iron coal fired stove is a good idea: it would result in severe burns and trips to casualty, if not a full-scale conflagration. It is a shame, though, that I spend all day working on the line and then am not home until after they’re in bed.

Today, well, we have a strict no-romance-on-the-14th rule in this house; so instead of doing anything special we went into town and did the usual mundane weekend shopping: new gloves for the children, some stuff from the craft shop; a new USB cable. The Child Who Likes Fairies has learned the word “gouache”.

Inconsistency

In which different tools behave in different ways

One of those days when everything seemed to go wrong at work this afternoon. Partly because of things I broke, partly because of things that other people had messed up before I got there, partly because of things that seemed to go wrong entirely by themselves.

For example - warning, dull technical paragraph ahead - I hadn’t realised that Visual Studio can cope remarkably well with slightly-corrupt solution files and will happily skip over and ignore the errors; but other tools such as MSBuild will throw the whole file out, curl up and cry into their beer. Visual Studio, whilst ignoring the error, also won’t fix it. Therefore, when git is a git and accidentally corrupts a solution file in a merge, you will have no problems at all on a local build, but mysterious and hard-to-fix total failures happen whenever you try to build on the build server.

Update, September 8th 2020: At some point I will write a proper blog post about what happened here, how to spot it is going to happen, and how to fix it, because although MSBuild is going away now we are in the .NET Core world there are still plenty of people out there using .NET Framework, and they still occasionally face this problem.
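In the meantime, here is a sketch of the sort of check a build server could run before handing a solution file to MSBuild. The post does not say what the corruption actually was, so this is a guess at two things a bad merge can plausibly leave behind: leftover conflict markers, and `Project(...)` entries that have lost their `EndProject` line. `check_solution` is a hypothetical helper, not part of any Microsoft tool.

```python
import re

PROJECT_START = re.compile(r'^Project\("\{[0-9A-Fa-f-]+\}"\)')
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def check_solution(text: str) -> list[str]:
    """Return a list of problems found in the text of a .sln file."""
    problems = []
    open_line = None  # line number of a Project(...) awaiting its EndProject
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if CONFLICT_MARKER.match(line):
            problems.append(f"line {number}: leftover merge conflict marker")
        elif PROJECT_START.match(stripped):
            if open_line is not None:
                problems.append(f"line {open_line}: Project without EndProject")
            open_line = number
        elif stripped == "EndProject":
            if open_line is None:
                problems.append(f"line {number}: EndProject without Project")
            open_line = None
    if open_line is not None:
        problems.append(f"line {open_line}: Project without EndProject")
    return problems


# A made-up project entry, then the same entry with its EndProject line lost.
good = (
    'Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "App", '
    '"App\\App.csproj", "{11111111-1111-1111-1111-111111111111}"\n'
    "EndProject\n"
)
broken = good.replace("EndProject\n", "") + good
problems = check_solution(broken)  # ["line 1: Project without EndProject"]
```

A check like this only catches structural damage. It says nothing about whether the projects listed are the right ones.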

Disassembly, Reassembly

In which we try to use metaphor

The past two days at work have largely just been the long slog of writing unit tests for a part of the system which firstly, was one of the hairiest and oldest parts of the system; and secondly, I’ve just rewritten from scratch. In its non-rewritten form it was almost entirely impossible to test, due to its reliance on static code without any sort of injection.
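To illustrate what "static code without any sort of injection" does to testability, here is a minimal sketch. It is in Python rather than whatever the real system is written in, and the overdue-date rule is invented purely as an example.

```python
from datetime import date
from typing import Callable


def is_overdue_static(due: date) -> bool:
    """Hard to test: the result depends on the real clock via a static call."""
    return date.today() > due


def is_overdue(due: date, today: Callable[[], date] = date.today) -> bool:
    """Testable: the clock is injected, defaulting to the real one."""
    return today() > due


def fixed_clock() -> date:
    """A fake clock that a unit test can control."""
    return date(2020, 1, 15)


# With the injected clock, the answer no longer depends on when the test runs.
overdue = is_overdue(date(2020, 1, 14), today=fixed_clock)
```

The first function gives a different answer depending on the day you run it, so a test for it can only ever be right by accident. The second takes its dependency as a parameter, and a test can pin it down.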

For non-coding people for whom this is all so much “mwah mwah mwah” like the adults in Peanuts: a few weeks ago I was doing some interesting work, to wit, dismantling a creaking horror, putting its parts side by side on the workbench, scraping off the rust and muck and polishing them up, before assembling the important bits back together into a smoother, leaner contraption and throwing away all the spare screws, unidentifiable rusted-up chunks and other bits that didn’t seem to do anything. Now, though, I have the job of going through each of the newly-polished parts of the machine and creating tools to prove that they do what I think they were originally supposed to do. As the old machine was so gummed-up and tangled with spiderwebs and scrags of twine, it was impossible to try to do this before, because trying to poke one part would have, in best Heath Robinson style, accidentally tugged on the next bit and pushed something else that was supposed to be unconnected, setting off a ball rolling down a ramp to trip a lever and drop a counterweight to hit me on the head in the classic slapstick manner. All this testing each aspect of the behaviour of each part of the device is, clearly, a very important task to do, but it’s also a very dull job. Which is why an awful lot of coders don’t like to do it properly, or use it as a “hiding away” job to avoid doing harder work.

Nevertheless, today it did lead me to find one of those bugs which is so ridiculous it made me laugh, so ridiculous that you have a “it can’t really be meant to work like that?” moment and have to dance around the room slightly. I confirmed with the team business analyst that no, the system definitely shouldn’t behave the way the code appeared to. I asked the team maths analyst what he thought, and he said, “actually, that might explain a few things.”

Repetition

In which we get annoyed with AWS

The problem with writing a diary entry every day is that most weeks of the year, five days out of seven are work. It’s hard to write about work and make it interesting and different every day; and also not write about anything too confidential.

Writing about work itself would quickly become pretty dull, I fear, however interesting I tried to make it. Today, I wrestled with the Amazon anaconda. Amazon have a product called Elastic Beanstalk, which is a bit like a mini virtual data centre for a single website. You pick a virtual image for your servers, you upload a zip file with your website in it, and it fires up as many virtual servers as you like, balances the load between them, fires up new servers when load is high and shuts down spare ones when load is low. If you’re not careful it’s a good way to let people DDoS your credit card, because you pay for servers by the hour, but all-in-all it works quite well. The settings are deliberately fairly straightforward: how powerful a server do you want to run on, how many of them do you need at different times of the day, and a few other more esoteric and technical knobs to tweak. Elastic Beanstalk isn’t so much a product in itself, as a wrapper around lots of other Amazon Web Services products that you can use separately: virtual servers, server auto-scaling and inbound load balancing. The whole idea is to make tying those things together in a typical use-case a really easy thing to do, rather than having to roll your own each time and struggling with the hard bits. The only thing that’s specifically Elastic Beanstalk is the control panel and configuration on top of it, which keeps track of what you want installed on your Elastic Beanstalk application and controls all the settings of the individual components. You can still access the individual components too, if you want to, and you can even change their settings directly, but doing so is usually a Bad Idea as then the Elastic Beanstalk control layer will potentially get very confused.
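The "more servers when load is high, fewer when it is low" rule boils down to something like the sketch below. This is not the AWS API or Amazon's actual algorithm; the function name, the CPU thresholds and the limits are all assumptions chosen for illustration.

```python
def desired_servers(
    current: int,
    cpu_percent: float,
    *,
    minimum: int = 1,
    maximum: int = 4,
    scale_up_at: float = 70.0,
    scale_down_at: float = 30.0,
) -> int:
    """Return how many servers to run next, given average CPU load."""
    if cpu_percent > scale_up_at:
        current += 1
    elif cpu_percent < scale_down_at:
        current -= 1
    # Clamp to the configured limits; the maximum is what protects the bill.
    return max(minimum, min(maximum, current))


next_count = desired_servers(2, 85.0)  # busy, so one more server: 3
```

The `maximum` is the setting that stops a traffic flood from turning into a DDoS on your credit card: however high the load goes, the server count stops there.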

Today, I found I couldn’t update one of our applications. A problem with an invalid configuration. Damn. So I went to fix the configuration - but it was invalid. So I couldn’t open it, to fix it. It was broken, but I couldn’t fix it because it was broken. Oh.

That’s how exciting work is. One line of work held up, whilst I speak to Ops, get the broken Elastic Beanstalk replaced from scratch with a working one. In theory I could have done it myself, but our Ops chap doesn’t really like his part of the world infringed unilaterally.

The woman at the desk opposite me is on a January diet. One of those diets that involves special milkshakes and lots of water all day. Personally, I’d rather have real food.

Curious problem

In which FP has an obscure font problem, in annoyingly specific circumstances

Only a day after the new garden blog went live, I found myself with a problem: this morning, I noticed something wrong with it on K’s PC. Moreover, it was only a problem on K’s PC. On her PC, in Firefox and in IE, the heading font was hugely oversized compared to the rest of the page. In Chrome, everything was fine.

Now, I’d tested the site in all of my browsers. On my Windows PC, running Windows 7 just like K’s, there were no problems in any of the browsers I’d tried. On my Linux box, all fine; on my FreeBSD box, all fine. But on K’s PC, apart from in Chrome, the heading font was completely out. Whether I tried setting an absolute size or a relative size, the heading font was completely out.

All of the fonts on the new site are loaded through the Google Webfonts API, because it’s nice and simple and practically no different to self-hosting your fonts. Fiddling around with it, I noticed something strange: it wasn’t just a problem specific to K’s PC, it was a problem specific to this specific font. Changing the font to anything else: no problems at all. With the font I originally chose: completely the wrong size on the one PC. Bizarre.

After spending a few hours getting more and more puzzled and frustrated, I decided that, to be frank, I wasn’t that attached to the specific font. So, from day 2, the garden blog is using a different font on its masthead. The old one – for reference, “Love Ya Like A Sister” by Kimberly Geswein – was abandoned, rather than wrestle with getting it to render at the right size on every computer out there. The replacement – “Cabin Sketch” by Pablo Impallari – does that reliably, as far as I’ve noticed;* and although it’s different it fits in just as well.

* this is where someone writes in and says it looks wrong on their Acorn Archimedes, or something along those lines.

Vampire-Spotting

In which we suspect that some TV cameras might be taking the train

Regular readers over the past couple of years might have noticed that I quite enjoy spotting the filming locations of the paranormal TV drama* Being Human, filmed in a variety of easily-recognisable Bristol locations: Totterdown, Bedminster, Clifton, St George, College Green, and so on. Not for much longer, though, we thought: although the first two series were Bristol-based, the third series is apparently being moved over to Cardiff. Whether it will be the recognisable Cardiff Cardiff of Torchwood, or the generic anycity of Doctor Who, remains to be seen; but this was all clearly set up when, at the end of Series Two, the protagonists were forced to flee the house on the corner of Henry St and Windsor Terrace for an anonymous rural hideout. No more Bristol locations for us to spot, we thought.

Over the past week, we’ve been doing a lot of driving about moving house; we now know every intimate corner of every sensible route from south Bristol to east Bristol, or at least it feels like we do. So we were slightly surprised to see that, about a week ago, some more of these pink signs have popped up. “BH LOC” and “BH BASE”, as before.

We spotted them on Albert Road, near the Black Castle. “BH BASE” points along Bath Road, towards the Paintworks and the ITV studios. “BH LOC”, though, is intriguing. It points down the very last turning off Albert Road before the Black Castle end. That entrance only goes to two places: a KFC branch, and St Philips Marsh railway depot.

If you watched the second series of Being Human, you might remember that there was, indeed, a rather brutal train-based scene in a First Great Western carriage.** So, expect the third series to include, at the very least, an extension of that scene, if not a spin-off plotline. Or, alternatively, those signs aren’t really anything to do with Being Human at all, and it’s just coincidence that they pop up around Bristol a few months before each series appears on the telly.*** My money’s on that train from Series Two being the root of part of the Series Three plot; but, I guess, we’ll just have to wait, watch and see.

* Well, it started off as a comedy, and got more serious as it went along.

** I was impressed that the programme’s fidelity-to-location included shooting that scene in a genuine local train, rather than just finding any railway prepared to get a carriage soaked with fake blood. Of course, it was probably a convenient location too.

*** The third possibility, of course, is that someone in Series Three tries to cure vampires and werewolves of their respective curses by getting them to eat large amounts of fried chicken.