
Symbolic Forest

A homage to loading screens.


Crossing things off (part two)

What, continuing with a craft project instead of starting a new one?

For once, I have managed to continue on with the ongoing craft projects without starting any new ones for, ooh, must be nearly a couple of months now. Most of the crafting time has been devoted to the cross-stitch project I mentioned back in July. Despite a break for my holiday—because it’s too large to go in the luggage—I’ve got on quite a way with it. Here’s the progress to date.

Progress on the new cross stitch project

It’s quite hard to take a decent photograph of, because that black background greatly confuses any camera which attempts any degree of automation. Maybe I should try telling them to use Night Mode.

Video killed the documentation star

Despite its popularity, video is really not the best way for a lot of people to learn things

Recently I added Aria Salvatrice to the list of links over in the menu, because I’m always looking for new and interesting regular reads, especially ones that use old-fashioned blogging. In this case, I found myself nodding along to one of its posts: Video Tutorials Considered Harmful, about how videos are a much worse venue for learning a technical topic than written documentation.

In general, I agree wholeheartedly with this, with an exception that I’ll come to below. Aria gets to what I think is the nub of the problem: that for some people, with some forms of neurodiversity, it’s really, really hard to focus on the video enough to take it in properly and digest it, and far too easy to get distracted. Your mind just wanders off, in a way that doesn’t happen—or at least not as much—if you’re reading a written text. All of a sudden, you realise that your head has been completely elsewhere for the last five minutes, and you have no idea what you’re watching any more.

What I find strange about this in the tech world, though, is that neurodiversity is hardly rare among software developers and similar professions. This is something that has come up with my current colleagues more than once: a good proportion of us have this same problem. If we start watching an explanatory video, our minds wander off; all of a sudden, we’ve missed a huge chunk of it and have no idea where we are. If it’s so common among tech practitioners, why are these types of video so prevalent in the tech world?

The Plain People of the Internet: But don’t you yourself there have your own YouTube channel?

Yes, I do, but I don’t use it to try to teach you things. Not technical things, at any rate: those get turned into text and posted here, or wherever is most relevant. I don’t create videos of myself lecturing to camera.

That brings me onto another aspect of this, though: the difference between good and bad videos, and how bad videos make things ten times worse. Now, I haven’t posted anything on YouTube for quite a long time, but that’s largely because of the effort involved in making a video that I think is good enough to put out there. In short: I edit. I don’t just live-record a video of me doing something, chat as I go along and upload it; instead I edit. I cut it down, I write a narration, I record and edit that and stitch the whole thing together so that a project that took me several days in real life becomes a ten-minute video. In the sort of tech videos I’m talking about, this often doesn’t happen. Aria writes about this in its original post:

[M]ost video is entirely improvised, and almost never cut to remove wasted time. People’s thoughts meander. Their explanations take five sentences to convey what a single one could have said with more clarity. They wait on software to load, and make you wait along. They perform a repetitive task six times, and make you watch it six times, they perform a repetitive task six times, and make you watch it six times, they perform a repetitive task six times, and make you watch it six times […] And while it is easy to skip repetitive text, it is difficult to know where to skip ahead in a video.

Because actually editing that down, writing a script, making it concise and informative is itself a skill, and a hard one to learn. It’s difficult work. Much easier to just video a stream-of-consciousness ramble and push the whole thing up to the Internet unedited. And that’s why people do it: in much the same way, it’s easier than writing good documentation. Knowing how to explain something you know well to someone who knows little about it is also a surprisingly difficult skill, one that a lot of people don’t even realise they don’t have.

This doesn’t necessarily apply to videos demonstrating physical things that are much harder to describe than to show, by the way. Crafting tutorials, for example, such as How To Crochet A Magic Ring. Even in that case, though, the good ones are carefully edited: brief, clear and concise.

In short, what I’m saying is that video has taken over (to some extent) from written documentation because, if you’re willing to accept low quality, it’s much easier to produce, even if the results are worthless. It’s inherently lower quality because of the flaws in the format mentioned in Aria’s piece, such as the lack of searchability; and it’s accidentally low quality because making it good would take as much effort as writing good textual documentation, or more. The accidental flaws can be fixed by putting the effort in and learning the skills to make a good video; the inherent flaws of the format can’t be changed. Better all round to produce written documentation from the start.

Crossing things off

Finish craft projects? Nah. Start new ones? Yes please

There are still numerous craft projects somewhere in mid-flight at Symbolic Towers, and I keep slowly gathering plans for more that I haven’t even started yet. I have enough crochet patterns to keep me crocheting for several years, probably; a very large cross-stitch under way, and several other cross-stitch kits ready to start—and that’s to say nothing of the Lego or the model train kits. None of these things, really, have been posted on here, largely because I think “I’ll save them for YouTube” and then never video them either.

Despite all that, I’ve just started yet another cross stitch project!

What’s exciting about this project, the reason why it’s using up most of my crafting energy at the moment, is that: for the first time, this isn’t a kit. It’s not even a pattern I’ve bought and then found my own materials for, like most of the crochet projects. No, for the first time, this is a pattern I created myself. I saw something I thought would make a good cross stitch project, turned it (with the help of software) into a chart, and got started.

The start of a new cross stitch project

Because this isn’t something that was designed specifically for cross stitch by a specialist cross stitch designer, it does use quite a lot of colours, and it’s going to be a bit more complex than pretty much all of the cross stitch kits I’ve tried so far. Because of that, for the first time, I’ve actually started crossing off each of the stitches on the pattern as I do it—it helps that I know I can always print another copy off, of course. It is definitely going to help the further into this I get, though, especially when I get to the parts of the design which include lots of small areas of different colours, or the parts with lots of confetti—the cross stitch term for single isolated stitches scattered one-at-a-time across the background. This project will have a lot of confetti.

Crossing things off as I go

It will be some months before the whole thing is finished, even though it’s not full coverage, and even though I deliberately avoided including any backstitch in the design. For now, though, new-project energy is carrying me bowling along at pace. Only a week in, and already I’ve done a good chunk of the pattern’s central focal point.

Progress, as of yesterday

That’s quite a good chunk of stitching for one week’s spare evening moments. What is it, you ask? Well, to know that…if you don’t recognise it, you’ll just have to wait and find out.

To read the next post about this project, follow this link

Refactoring

Or, making the site more efficient

Back in March, I wrote about making my post publishing process on this blog a bit simpler. Well, that was really just a side effect. The main point of that post, and the process behind it, was to find a simple and cheap way to move this site onto HTTPS-based hosting, which I accomplished with an Azure Static Web App. The simpler publishing process was the side effect, because the official way to deploy an Azure Static Web App is via Microsoft Oryx, run from a GitHub Action. So now, when I write a new post, I have a fairly ordinary workflow similar to what I’d use (and do use!) in a multi-developer team. I create my changes in a Git branch, open a GitHub pull request, merge that pull request, and the act of merging kicks off a GitHub Action pipeline that fires up Oryx, runs Wintersmith, and produces a site image which Oryx then uploads to Azure. Don’t be scared of all the different names for all the steps: for me, it’s just a couple of buttons that set off a whole Heath Robinson chain of events. If I was doing this in a multi-person team, the only real difference would be getting someone else to review the change before I merge it, just to make sure I haven’t said something completely stupid.

You, on the other hand, are getting me unfiltered.

I mentioned in that previous post that Oryx would often give me a very vague Failure during content distribution error if the content distribution step—the step that actually uploads the finished site to Azure—hit a five-minute timeout. I tried to address this, at least partially, by cutting the size and number of the tag pages; and it did address it, partially. Not all of the time though. After an evening of trying to deploy a new post for an hour or so, hitting the timeout each time and trying again, I decided I had to come up with a better approach. What I came up with, again, has another rather nice side effect.

A little bit of digging around in what other people facing the Failure during content distribution error had written unearthed a useful tidbit of info. That timeout doesn’t just happen when the site is large; it’s also more likely to happen when an upload contains a lot of changes.

Now, every page on this site has a whole bunch of menus. If you’re on desktop, they’re over to your right; if you’re on mobile, they’re down at the bottom of the page somewhere. There’s articles filed by category and articles filed by date. There’s the cloud of all the tags I’ve used the most, and there’s links to other sites—I really should give that a refresh. Those blocks are on every page. The ones which link to other pages include a count of articles in that category or month, so you know what you’re letting yourself in for. The tag cloud’s contents shift about occasionally, depending on what I’ve written. The end result is that, when I was adding a new post to the site, every single page already on the blog had to be rewritten. For example, this is the first (and only) post from May 2024, so every single page already on the site (all 4,706 of them) had to be rewritten to add a “May 2024 (1)” line at the top of the “Previously…” section. That’s about 84% of the files on the blog changing, just to add one new post.

However…it doesn’t have to be like that.

The whole site doesn’t have to be completely static. It can still remain server-side static, if you’re willing to rely on a bit of client-side JavaScript; most people, the vast majority of people, have JavaScript available. Instead of including those menus in every page, I thought, why not render those menus once, and have a wee piece of JavaScript on each page that pulls in those blocks on demand?

It wasn’t that hard to do. Rendering the files just needed me to pull those blocks out of the main layout template and into their own files. The JavaScript code to load them is all of 11 lines long, and that’s if you write it neatly; it really just makes an HTTP GET call and, when the results come back, inserts them into the right place on the page. There’s a sensible fallback option that lets you click through to a truly-static version of each menu, just in case you’re having problems—those largely already existed, but weren’t really being used. Now, adding one new post changes, at the moment, just over a hundred files. Most of those are the hundred-ish files that make up the “main sequence” of posts: when you add one at the top, another drops off the bottom and onto the next page, and so on all the way down. There are also the affected category and month pages. Even so, you’re going from changing ~84% of the pages on the site to changing somewhere around 2-5%. That’s a massive difference. It also reduces the size of the site quite a lot: those menus are over 12Kb of code all together. Not very much by modern standards, just once; but repeated on every page of the site, that was using up about 58Mb of space which has now been clawed back.
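If you’re curious what that wee piece of JavaScript amounts to, here is a minimal sketch of the general idea rather than the site’s actual script; the fragment URLs and element IDs are invented for illustration.

    // Sketch only: fetch each pre-rendered menu fragment and slot it
    // into its placeholder element. The URLs and element IDs here are
    // made up for illustration, not taken from the real site.
    document.addEventListener("DOMContentLoaded", () => {
      const fragments = [
        { id: "menu-categories", url: "/fragments/categories.html" },
        { id: "menu-archive", url: "/fragments/archive.html" },
        { id: "menu-tagcloud", url: "/fragments/tagcloud.html" },
      ];
      for (const { id, url } of fragments) {
        fetch(url)
          .then((response) => response.text())
          .then((html) => {
            const target = document.getElementById(id);
            // If anything goes wrong, the static fallback link already
            // inside the placeholder is simply left alone.
            if (target) target.innerHTML = html;
          })
          .catch(() => { /* keep the fallback link */ });
      }
    });

Each placeholder can start out holding nothing but the link to the truly-static version of its menu, so anyone without JavaScript still gets something usable.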

Naturally, the first deployment of the new system took a few goes to work, because it was still changing every page on the site. Since then, though, deployments have gone completely smoothly, and the problem hasn’t come back once. Hopefully things will stay that way.

This isn’t the only improvement I’ve been working on, by the way. There’s another big change to how the site is published coming up, but it isn’t quite ready to go live yet. I’ll be blogging about it when it reaches production, once I find enough free time to get it finished. It’s something I’m really pleased with, even though, if I didn’t tell you, you wouldn’t actually notice a thing. You’ll just have to wait for the next meta-blog post about engineering on the site to find out what I’m working on.

Typecasting

In which Caitlin is at risk of acquiring a new hobby

One stereotypical nerd gadget I’ve never seen the point of, that I always assumed was the nerd equivalent of hand-woven gold hi fi cables, was the mechanical keyboard. I assumed they were, as the phrase goes, fidget spinners for IT geeks. Something that is expensive and makes lots of fun clicky-clacky noises, but doesn’t actually change your computing experience by one tiny bit.

Well, reader, I was wrong. I admit it. Completely, absolutely, 100 per cent wrong. Switching to a mechanical keyboard has been one of the best productivity improvements I could have made to my workplace. Since I started using one, my typing has speeded up enormously. It’s definitely not just a toy. Having a decent length of travel on each key movement somehow genuinely makes it much easier and quicker for me to type; and also makes my typing a lot more confident. I’ve never learned to type properly, and I still make a lot of mistakes, but in general I’m finding my fingers skip across the keys much more freely.

This first started last summer, when I was already tempted by the idea and saw that a fairly cheap model had been reduced quite a lot in a sale. So, I bought it. And, if nothing else, it was pretty. It glowed, with rainbow light. It came with a choice of beige or purple keycaps, so being contrary I naturally changed just half of them over, trying to get a dithering kind of effect from beige on the left to purple on the right. It kind of worked. Typing, though, was excellent.

The mixed keycaps of my first mechanical keyboard, with shine-through legends on the keys

I felt like I was typing much better than I ever had on laptop keyboards, but there was something wrong. Still, I resisted the temptation to be a keyboard nerd. An enthusiast. One keyboard would be enough for me.

The problem with the first keyboard was that it was only a 60% model. In other words, it only has about 60% of the keys of a “full” PC keyboard; just the core letters and numbers really. To get all the other functions, you need a modifier key. A lot of laptops do that to access extra functions or squeeze all of the keys into a laptop case, but this was using it for fairly basic functionality like the four cursor keys. When coding, I find myself moving around with them a lot, so having to chord to use them quickly became annoying. On top of that there were other little problems: the Bluetooth connection would sometimes glitch out, particularly if the battery was low. When the battery ran low the only warning was one of the modifier keys flashing, and then when you charged it up there was no sign of how charged it was. On the good side, its small size made it nice and portable. Overall, it was a good starter.

After a few months, I decided it was time to think about buying a full-size mechanical keyboard. And why not go all in and just buy a “barebones” model? A barebones keyboard is, well, not really a keyboard at all. It’s the core of a keyboard, but it doesn’t have any keys. You have to fit it out with keyswitches and keycaps for it to work. When it arrived, it was very nicely packaged and felt substantial, solid and heavy, but I couldn’t actually start using it.

The new barebones keyboard, a Keychron K10, without any switches or keycaps

It’s a Keychron K10 model, and all you have to do to get it working is push switches into each of those sockets. You get to choose the brand of keyswitch you want, though, and switch manufacturers publish complex charts of the response and movement of different types of switch, describing them as “soft”, “firm”, “clicky” and so on. I just went for a fairly soft switch from a well-known brand, and set to work plugging them all in. It was quite a therapeutic job, pushing each switch home until it sat firmly in place.

Plugging switches into the keyboard.  If I'd been planning to blog about this I'd have done my nails first

All the switch sockets nicely filled in

The harder part is choosing the keycaps: harder because, as well as how they feel, they have to look pretty too, and there is an innumerable assortment of manufacturers who will sell you pretty keys. In the end, I just couldn’t decide, so went with a set of plain black “pudding” keycaps. “Pudding” keycaps have a solid, opaque top but translucent sides, so the backlight on each key shines nicely through. I’m not sure they are the right keycaps for me long-term, but they were a nice and cheap “first set”.

The finished keyboard with pudding keycaps

Am I going to turn into a keyboard nerd? Well, I’ve already tweaked it a wee bit. I kept hitting the “Insert” key by accident, not being used to having a key there, so I’ve already changed the switch on that specific key to be a much firmer, clickier one, so that at least when I do hit it by accident I notice I’ve done it. I’ll probably change the keycaps for something prettier at some point, something a bit more distinctive. I’m not going to go out and buy a lot more keyboards, because I already think this one is very nice to type on. It has a sensible, useful power lamp that flashes when the battery’s low, is red when it’s charging and goes green when it’s finished. But, overall: I admit I was wrong. This is much, much nicer to type on—I’m writing this post on it now—than a standard laptop keyboard is. For something I’ll use pretty much every day that I’m at home, it’s definitely worth the money.

Modern technology

Or, keeping the site up to date

Well, hello there! This site has been on something of a hiatus since last summer, for one reason and another. There’s plenty to write about, there’s plenty going on, but somehow I’ve always been too busy, too distracted, with too many other things on, to sit down and want to write a blog post. Moreover, there were some technical issues that I felt I needed to get resolved too.

This site has never been a “secure site”. By that I mean, the connection between the website’s server and your browser has never been encrypted; anyone with access to the network in-between can see what you’re looking at. Alongside that, there’s no way for you to be certain that you’re looking at my genuine site, that the connection from your browser or device is actually going to me, not just to someone pretending to be me. Frankly, I’d never thought, for the sort of nonsense I post here, that it was very important. You’re not going to be sending me your bank details or your phone number; since the last big technical redesign, all of four years ago now, you haven’t been able to send me anything at all because I took away the ability to leave a comment. After that redesign was finished, “turn the site into a secure site” was certainly on the to-do list, but never very near the top of it. For one thing, I doubt anyone would ever want to impersonate me.

That changed a bit, though, in the last few months. There has been a concerted effort from the big browser companies to push users away from accessing sites that don’t use encryption. This website won’t show up for you in search results any more. Some web browsers will show you an error page if you go to the site, and you have to deliberately click past a warning telling you, in dire terms, that people might interfere with your traffic or capture your credit card number. That’s not really a risk for this site, but the general trend has been to push non-technical users towards thinking that all non-encrypted sites are extremely dangerous to the same degree. It might be a bit debatable, but it’s easy and straightforward for the browser makers to do, and it does at least avoid any confusion for the users: it saves them having to make any sort of value judgement about a technical issue they don’t properly understand. The side effect is that it puts a barrier in front of actually viewing this site. To get over that barrier, I’d have to implement TLS security.

After I did the big rewrite, switching this site over from WordPress to a static site generator back in 2020, I wrote a series of blog posts about the generation engine I was using and the work pipeline I came up with. What I didn’t talk about very much was how the site was actually hosted. It was hosted using Azure Storage, which lets you expose a set of files to the internet as a static website very cheaply. Azure Storage supports using TLS encryption for your website, and it supports hosting it under a custom domain like symbolicforest.com. Unfortunately, it doesn’t very easily let you do both at the same time; you have to put a Content Delivery Network in front of your Storage container and terminate your TLS connection on the CDN. It’s certainly possible to do, and if this was the day job then I’d happily put the parts together. For this site, though, a weird little hobby site that I sometimes don’t update for months or years at a time, it felt like a fiddly and expensive way to go.

During the last four years, though, Microsoft have introduced a new Azure product which falls somewhere in between the Azure Storage web-hosting functionality and the fully-featured hosting of Azure App Service. This is Azure Static Web Apps, which can host static files in a similar way to Azure Storage, but with a control panel interface more like Azure App Service. Moreover, Static Web Apps feature TLS support for custom domains, out of the box, for free. This is a far cry from 20-something years ago, when I remember having to get a solicitor to prove my identity before someone would issue me with a (very expensive) TLS certificate; according to the documentation, it Just Works with no real configuration needed at all. Why not, I thought, give it a bit of a try?

With Azure Storage, you dump the files you want to serve as objects in an Azure Blob Storage container and away you go. With an App Service, you can zip up the files that form your website and upload them. Azure Static Web Apps are a bit more complex than this: they only support deployment via a CI/CD pipeline from a supported source repository hosting service. For, say, GitHub, Azure tries to automate it as much as possible: you link the Static Web App to your GitHub account, specify the repository, and Azure will create an Action which is run on updates to the main branch, and which uses Microsoft Oryx to build your site and push the build artefacts into the web app. I’m sure you could pull apart what Oryx does, get the web app’s security token from the Azure portal, and replicate the whole process by hand, but the goal is clearly that you use a fully automated workflow.

My site had never been set up with an automated workflow: that was another “nice to have” which had never been very high on the priority list. Instead, my deployment technique was all very manual: once I had a version of the site I wanted to deploy in my main branch—whose config was set up for local running—I would merge that branch into a deploy branch which contained the production config, manually run npm run clean && npm run build in that branch, and then use a tool to upload any and all new or changed files to the Azure Storage container. Making sure this all worked inside a GitHub Action took a little bit of work: changing parts of the site templates, for example, to make sure that all paths within the site were relative, so that a single configuration file could handle both local and production builds. I also had to make sure that the top-level npm run build script called npm install for each subsite, including for the shared Wintersmith plugins, so that the build would run on a freshly-cloned repository without any additional steps, as sketched below. With a few other little tweaks to match what Oryx expected—such as the build output directory being within the source directory instead of alongside it—everything would build cleanly inside a GitHub Actions runner.
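To illustrate the sort of thing I mean, here is a minimal sketch of the top-level npm scripts, assuming a purely hypothetical layout with a shared plugins folder and two subsites called blog and misc; the real scripts are, naturally, a bit more involved.

    {
      "scripts": {
        "clean": "rimraf build",
        "install:all": "npm --prefix plugins install && npm --prefix blog install && npm --prefix misc install",
        "build": "npm run install:all && npm --prefix blog run build && npm --prefix misc run build"
      }
    }

Here rimraf is just a stand-in for whatever clean-up step you prefer, and each subsite has its own package.json with its own build script; the point is simply that the top-level build installs every subsite’s dependencies before building, so a freshly-cloned repository builds with no extra steps.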

It was here I hit the major issue. One of the big attractions of Azure Static Web Apps is that they’re free! Assuming you only want a personal site, with a couple of domain names, they’re free! Being from Northern England, I definitely liked that idea. However, free Static Web Apps also have a size limit of 250Mb. Oryx was hitting it.

This site is an old site, after all. There are just over a thousand posts on here, at the time of writing,* some of them over twenty years old. You can go back through all of them, ten at a time, from the home page; or you can go through them all by category; or month by month; or there are well over 3,000 different tags. Because this site is hosted through static pages, that means the text of each post is repeated inside multiple different HTML files, as well as each post having its own individual page. All in all, that adds up to about 350Mb of data to be hosted. I have to admit, that’s quite a lot. An average of 350Kb or so per post—admittedly, there are images in there which bump that total up a bit.

In the short term, this is fixable, in theory. Azure Static Web Apps offer two Hosting Plans at present. The free one, with its 250Mb limit, and a paid one. The paid one has a 500Mb limit, which should be enough for now. In the longer term, I might need to look at solutions to reduce the amount of space per post, but for now it would work. It wasn’t that expensive, either, so I signed up. And found that…Oryx still fell over. Instead of clearly hitting a size limit, I was getting a much vaguer error message. Failure during content distribution. That’s not really very helpful; but I could see two things. Firstly, this only occurred when Oryx was deploying to my production environment, not to the staging environment, so the issue wasn’t in my build artefacts. Secondly, it always occurred just as the deployment step passed the five-minute-runtime mark—handily, it printed a log message every 15 seconds which made that nice and easy to spot. The size of the site seemed to be causing a timeout.

The obvious place to try to fix this was with the tag pages, as they were making up over a third of the total file size. For comparison, all of the images included in articles were about half, and the remaining sixth, roughly speaking, covered everything else including the individual article pages. I tried cutting the article text out of the tag pages, assuming readers would think to click through to the individual articles if they wanted to read them, but the upload still failed. However, I did find a hint in a GitHub issue, suggesting that the error could also occur for uploads which changed lots of content. I built the site with no tag pages at all, and the upload worked. I rebuilt it with them added in again, and it still worked.

Cutting the article text out of the tag pages has only really reduced the size to about 305Kb per post, so for the long term, I am definitely going to have to do more to ensure that I can keep blogging for as long as I like without hitting that 500Mb size limit. I have a few ideas for things to do on this, but I haven’t really measured how successful they’re likely to be. Also, the current design requires pretty much every single page on the site to change when a new post is added, because of the post counts on the by-month and by-category archive pages. That was definitely a nuisance when I was manually uploading the site after building it locally; if it causes issues with the apparent 5-minute timeout, it may well prove to be a worse problem for a Static Web App. I have an idea of how to work around this, too; hopefully it will work well.

Is this change a success? Well, it’s a relatively simple way to ensure the site is TLS-secured whilst still hosting it relatively cheaply, and it didn’t require too much in the way of changes to fit it into my existing site maintenance processes. The site feels much faster and more responsive, subjectively to me, than the Azure Storage version did. There are still more improvements to do, but they are improvements that would likely have been a good idea in any case; this project is just pushing them further to the top of the heap. Hopefully it will be a while before I get near the next hosting size limit; and in the meantime, this hosting change has forced me to improve my own build-and-deploy process and make it much smoother than it was before. You never know…maybe I’ll even start writing more blog posts again.

* If I’ve added up correctly, this is post 1,004.

No more cookies!

Or, rather, no more analytics

Regular readers—or, at least, people who have looked at this site before the last month or two—might remember that it used to have a discreet cookie consent banner at the top of the page, asking if you consented to me planting a tracking cookie that I promised not to send to anyone else. It would pop up again about once a year, just to make sure you hadn’t changed your mind. If you clicked yes, you appeared on my Google Analytics dashboard. If you clicked no, you didn’t.

What you probably haven’t noticed is that it isn’t there any more. A few weeks ago now, I quietly stripped it out. This site now puts no cookies of any sort on your machine, necessary or otherwise, so there’s no need for me to ask to do it.

When I first started this site’s predecessor, twenty-something years ago, I found it quite fascinating looking at the statistics, and in particular, looking at what search terms had brought people to the site. If you look back in the archives, it used to be a common topic for posts: “look what someone was searching for and it led them to me!” What to do when you find a dead bat was one common one; and the lyrics to the children’s hymn “Autumn Days When The Grass Is Jewelled”. It was, I thought—and I might not have been right about this—an interesting topic to read about, and it was certainly a useful piece of filler back in the days of 2005 when I was aiming to publish a post on this site every day, rather than every month. If you go back to the archives for 2005, there’s a lot of filler.

Now, though? Hopefully there’s not as much filler on the site as there was back then. But the logs have changed. Barely anything reaches this site through “organic search” any more—“organic search” is the industry term for “people entering a search phrase in their browser and hitting a link”. Whether this means Google has got better or worse at giving people search results I don’t know. Personally, for the searches I make, Google has got a lot worse for the sort where I don’t know what site I want to go to beforehand, but better for the sort of lazy search where I already know where I want to go. I suspect the first sort were generally the ones that brought people here. Anyway, all the traffic to this site comes from people who follow me on social media and follow the link when I tell them there’s a new blog post up.

Given that the analytics aren’t very interesting, I hadn’t looked at them for months. And, frankly, do I write this site in order to generate traffic to it? No, I don’t. I write this site to scratch an itch, to get things off my chest, because there’s something I want to say. I write this site in order to write this site, not to drive my income or to self-promote. I don’t really need a hit counter in order to do that. Moreover, I realised that in all honesty I couldn’t justify the cutesy “I’m only setting a cookie to satisfy my own innate curiosity” message I’d put in the consent banner, because although I was just doing that, I had no idea what Google were doing with the information that you’d been here. The less information they can gather on us, the better. It’s an uphill struggle, but it’s a small piece in the jigsaw.

So, no more cookies, no more consent banner and no more analytics, until I come up with the itch to write my own on-prem cookie-based analytics engine that I can promise does just give me the sort of stats that satisfy my own nosiness—which I’m not likely to do, because I have more than enough things ongoing to last me a lifetime already. This site is that little bit more indie, that little bit more Indieweb, because I can promise I’m not doing anything at all to harvest your data and not sending any of it to any third parties. The next bit to protect you will be setting up an SSL certificate, which has been on the to-do list for some months now; for this site, given that you can’t send me any data, all SSL will really do is guarantee that I’m still me and haven’t been replaced, which isn’t likely to be anything you’re particularly worried about. It will come, though, probably more as a side-piece to some other aspect of improving the site’s infrastructure than anything else. This site is, always has been, proudly independent, and I hope it always will be.

Ongoing projects

As soon as something finishes, I start two more

The crafting project I mentioned in my last post is finished! Well, aside from blocking it and framing it, that is.

An actually completed cross stitch project of a Gothenburg tram

Me being me though, I couldn’t resist immediately starting two more. And then, of course, there’s the videos still to produce. I will get to the end of the list, eventually. In the meantime, here’s some photos of a few of the things in progress.

An in-progress Lego project all set up for filming

An in-progress crochet creation; this photo is from a few months ago but I still haven't produced the video about it

Frame from another in-progress Lego build which will probably be the first of these to hit YouTube

At some point, I promise, all of these projects will be complete and will have videos to go with them! Better make a start…

Self-promotion

A couple of Yuletide videos

It’s still the Yuletide season, although we’re now very much into the time-between-the-years when everybody is grazing on snacks and leftovers, has battened down the hatches against the storms, and has completely forgotten what day of the week it is.*

As it is still Yuletide, though, I thought I’d post a couple of the seasonal Lego videos I put up on my YouTube channel last week, before the holiday season had really got under way. Both of them are Lego build videos, for some seasonal sets that I picked up earlier in the month.

Firstly, a “Winter Holiday Train”…

…and, secondly, Santa’s Workshop

In a few days they’ll be going away ready for next year, but for now, I hope you enjoy them whilst you’re still feeling a little bit seasonal!

* No, don’t ask me either.

Know your limits!

Or, remember that computers are still not boxes of infinite resource, whatever you might think

Sometimes, given that I often work with people who are twenty years or so younger than me, I feel old. I mean, the archives of this blog go back over twenty years now: these are serious, intelligent colleagues, and when I started writing my first blog posts they were likely still toddlers.

Sometimes, though, that has an advantage. I was thinking of this when debugging some code a colleague had written, which worked fine up to a point, but failed if its input file was more than, say, a few tens of megabytes. When the input reached that size, the whole thing crashed with OutOfMemoryException even on a computer with multiple gigabytes of memory, a hundred times more memory than the hundred-megabyte example file the client had sent.

When I was younger, you see, that would have seemed a ridiculous amount of data, unimaginable to fit in one file. Even when I had my first PC, the thought of a file too big to fit on even a superfloppy like a Zip disk was a little bit mindblowing, even though the PC seemed massive compared to what I’d experienced before.

Back when I was at school, I’d tried to teach myself how to code on an Amstrad CPC, a mid-1980s 8-bit machine with a 64k address space and a floppy disk drive of 180k capacity. It was the second generation of 8-bit home machine really, more powerful than a C64 or a Sinclair Spectrum despite sharing the same CPU as the latter. Unlike those, it had a fully-bitmapped screen with individual pixels all fully addressable; however, that took up 16k of the 64k address space, so the actual code on it had to be pretty damn tight to fit. The programmers’ Firmware Manual—what we’d now call the API reference documentation—is of course scanned and online; one of the reasons I was never very successful coding on the machine itself* was that in the 1980s and 90s copies of it were almost impossible to find once Amstrad’s print run was exhausted. On the CPC, every byte you used counted; a lot of software development houses ended up cross-assembling their code purely because, for a large program, it was difficult to fit the source code itself onto the machine.** That’s the background I came from, and it still makes me wary nowadays of wasting too much memory or too many resources. I’m the sort of developer who will pass an expected size parameter to the List<T> constructor if it’s known, to avoid unnecessary reallocations, and who doesn’t reflexively add ToList() to the end of every LINQ operation—which is a good idea in any case, as long as you know when you do need it.

Returning to the present: what had my team member done, then, that he was provoking a machine into running out of memory when in theory he had plenty to play with? Well, there were two problems at work.

Firstly, yes, we’re talking about someone who has never tried building code in a tiny, tiny environment. The purpose of this particular code was to take an input zip file, open it, modify some of its content, recompress it, and send it off to an API elsewhere. Moreover, this had been done by re-using existing internal code, some of which wanted to operate on a Stream and some of which, for reasons I don’t know, wanted to operate on a byte[]. We had ended up with code that received the data in a MemoryStream, unzipped it in memory, and copied the contents out into more MemoryStream objects. Each of those was copied into a byte array which was passed to a routine that immediately copied its input into a new MemoryStream, before deserializing…well, you get the idea. The whole thing ended up with many, many copies of the input data in memory, either in essentially its original format or in a slightly modified form, and all of those copies were still in memory at the end of the process.

Secondly, there was another issue that was not quite so much the developer’s responsibility. This .NET code was being compiled in “Portable” form, and the server was, again for reasons best known to itself, deciding that it should run it with the 32-bit runtime. Therefore, although there should have been 16Gb of memory on the server instance, we were working with a 2Gb memory ceiling.

I did dig in and rewrite as much of the code as I thought I needed to. Some of the copying could be elided altogether; and as this wasn’t a time-critical piece of code, I changed a lot of the rest to use a temporary file instead of memory. The second issue had an easy, lazy fix: compile the thing as 64-bit only, so the server would have no choice of runtime. As a result I never did get to the bottom of why it was preferring the 32-bit runtime, but I had working, shippable, code at the end of the day, and that’s what mattered here.

What I couldn’t help thinking, though, was that the rewriting might not have been needed to begin with. A young developer who’s never worked on a genuinely small system has spent so much time never worrying about getting anywhere near the boundaries of what their virtual servers can cope with that, when they do hit those boundaries, it comes as a nasty, sudden shock. They have no idea at all what to do, or even where to start: an OutOfMemoryException may as well be an act of the gods. Maybe when I’m helping train people up, I should give them all an Amstrad CPC emulator and see what the result is.

* My high point was successfully cloning Minesweeper, but with keyboard controls.

** Some software was shipped on 16k ROMs, to go along with third-party ROM socket boxes that attached to the expansion bus; this kept the assembler and editor code out of the main address space, but it could still be difficult to fit the source code and the assembler output in memory at the same time. The ROMs were scanned on boot and each declared named entrypoints which could then be accessed as BASIC commands. At least one game I can remember—The Bard’s Tale—crashed if too many ROMs were attached, because each ROM could reserve an area of RAM for its own bookkeeping, and the game found itself without enough memory available.