
Symbolic Forest

A homage to loading screens.


Going through things one by one

Or, a coding exercise

One of my flaws is that as soon as I’m familiar with something, I assume it must be common knowledge. I love tutoring and mentoring people, but I’m bad at pitching exactly where their level might be, and in working out what they might not have come across before. Particularly, in my career, software development is one of those skills where beyond a certain base level nearly all your knowledge is picked up through osmosis and experience, rather than through formal training. Sometimes, when I’m reviewing my team’s code I come across things that surprise me a little. That’s where this post comes from, really: a few months back I spotted something in a review and realised it wouldn’t work.

This post is about C#, so apologies to anyone with no interest in coding in general or C# in particular; I’ll try to explain this at a straightforward level, so that even if you don’t know the language you can work out what’s going on. First, though, I have to explain a few basics. That’s because there’s one particular thing in C# (in .NET, in fact) that you can’t do, that people learn very early on that you can’t do, and that you have to find workarounds for. This post is about a very similar situation, which doesn’t work for the same reason, but that isn’t necessarily immediately obvious even to an experienced coder. In order for you to understand that, I’m going to explain the well-known case first.

Since its first version over twenty years ago, C# has had the concept of “enumerables” and “enumerators”. An enumerable is essentially something that consists of a set of items, all of the same type, that you can process or handle one-by-one. An enumerator is a thing that lets you do this. In other words, you can go to an enumerable and say “can I have an enumerator, please”, and you should get an enumerator that’s linked to your enumerable. You can then keep saying to the enumerator: “can I have the next thing from the enumerable?” until the enumerator tells you there’s none left.

This is all expressed in the methods IEnumerable<T>.GetEnumerator()* and IEnumerator<T>.MoveNext(), not to mention the IEnumerator<T>.Current property, which nobody ever actually uses. In fact, the documentation explicitly recommends you don’t use them, because they have easier wrappers. For example, the foreach statement.

List<string> someWords = new List<string>() { "one", "two", "three" };
foreach (string word in someWords)
{
    Process(word);
}

Under the hood, this is equivalent** to:

List<string> someWords = new List<string>() { "one", "two", "three" };
IEnumerator<string> wordEnumerator = someWords.GetEnumerator();
while (wordEnumerator.MoveNext())
{
    string word = wordEnumerator.Current;
    Process(word);
}

The foreach statement is essentially using a hidden enumerator that the programmer doesn’t need to worry about.

The thing that developers generally learn very early on is that you can’t modify the contents of an enumerable whilst it’s being enumerated. Well, you can, but your enumerator will be rendered unusable. On your next call to the enumerator, it will throw an exception.

// This code won't work
List<string> someWords = new List<string>() { "one", "two", "three" };
foreach (string word in someWords)
{
    if (word.Contains('e'))
    {
        someWords.Remove(word);
    }
}

This makes sense, if you think about it: it’s reasonable for an enumerator to be able to expect that it’s working on solid ground, so to speak. If you try to jiggle the carpet underneath it, it falls over, because it might not know where to step next. If you want to do this using a foreach, you will need to do it some other way, such as by making a copy of the list.

List<string> someWords = new List<string>() { "one", "two", "three" };
List<string> copy = someWords.ToList();
foreach (string word in copy)
{
    if (word.Contains('e'))
    {
        someWords.Remove(word);
    }
}

So, one of my colleagues was in this situation, and came up with what seemed like a nice, clean way to handle this. They were going to use the LINQ API to both make the copy and do the filtering, in one go. LINQ is a very helpful API that gives you filtering, projection and aggregate methods on enumerables. It’s a “fluent API”, which means it’s designed for you to be able to chain calls together. In their code, they used the Where() method, which takes an enumerable and returns an enumerable containing the items from the first enumerable which matched a given condition.

// Can you see where the bug is?
List<string> someWords = new List<string>() { "one", "two", "three" };
IEnumerable<string> filteredWords = someWords.Where(w => w.Contains('e'));
foreach (string word in filteredWords)
{
    someWords.Remove(word);
}

This should work, right? We’re not iterating over the enumerable we’re modifying, we’re iterating over the new, filtered enumerable. So why does this crash with the same exception as the previous example?

The answer is that LINQ methods (strictly speaking, here, we’re using “LINQ-To-Objects”) don’t return the same type of thing as their parameter. They return an IEnumerable<T>, but they don’t guarantee exactly what implementation of IEnumerable<T> they might return. Moreover, in general, LINQ prefers “lazy evaluation”. This means that Where() doesn’t actually do the filtering when it’s called; that would be a very inefficient strategy on a large dataset, because you’d potentially be creating a second copy of the dataset in memory. Instead, it returns a wrapper object, which doesn’t actually evaluate its filter until something tries to enumerate it.

In other words, when the foreach loop iterates over filteredWords, filteredWords isn’t a list of words itself. It’s an object that, at that point, goes to its source data and thinks: “does that match? OK, pass it through.” And the next time: “does that match? No, next. Does that match? Yes, pass it through.” So the foreach loop is still, ultimately, advancing an enumerator over someWords each time we go around the loop, even though someWords doesn’t immediately appear to be used. Because the loop body has modified someWords in the meantime, that enumerator throws.

What’s the best way to fix this? Well, in this toy example, you really could just do this:

someWords = someWords.Where(w => !w.Contains('e')).ToList();

which gets rid of the loop completely. If you can’t do that for some reason (and I can’t remember why we couldn’t do that in the real-world code this is loosely based on) you can add a ToList() call onto the line creating filteredWords, forcing evaluation of the filter at that point. Or, you could avoid a foreach loop a different way by converting it to a for loop, which is a bit more flexible than a foreach and in this case would save memory slightly; the downside is a bit more typing, and that your code becomes prone to subtle off-by-one errors if you don’t think it through thoroughly. There’s nearly always more than one way to do something like this, and they all have their own upsides and downsides.

As I said at the start, I spotted the issue here straightaway just by reading the code, not by trying to run it. If I hadn’t spotted it inside somebody else’s code, I wouldn’t even have thought to write a blog post on something like this. There are always going to be people, though, who didn’t realise that the code would behave like this because they hadn’t really thought about how LINQ works; just as there are always developers who go the other way and slap a ToList() on the end of the LINQ chain because they don’t understand how LINQ works but have come across this problem before and know that ToList() fixed it. Hopefully, some of the people who read this post will now have learned something they didn’t know before; and if you didn’t, I hope at least you found it interesting.

* Note: for clarity I’m only going to use the generic interface in this post. There is also a non-generic interface, but as only the very first versions of C# didn’t support generics, we really don’t need to worry about that. If you write your own enumerable you’re still required to support the non-generic interface, but you can usually do so with one line of boilerplate: IEnumerator IEnumerable.GetEnumerator() => GetEnumerator(); (an explicit interface implementation like this one can’t carry an access modifier such as public).

** In recent versions of C#, at any rate. In earlier versions, the equivalence was slightly different. The change was a subtle but potentially breaking one, causing a change of behaviour in cases where the loop variable was captured by a lambda expression.

The Paper Archives (part three)

The title of this series is maybe not quite as suitable as it was

The previous post in this series is here.

Sometimes, sorting through the accumulated junk that fills my mother’s house, I come across things that I remember from my childhood. For example: alongside the stack of modern radio transceivers that my dad used to speak to random strangers over the airwaves, is the radio I remember being my Nanna’s kitchen radio, sitting on top of the fridge.

The old kitchen radio

It’s a big, clunky thing for a portable, its frame made of leather-covered plywood. I know it has valves (or tubes) inside, not transistors, because I remember my dad having to source spare valves for it and plug them in back when my Nanna still used it daily—he was the only person in the family who knew how to work out which of the valves had popped when it stopped working.

With only a vague idea how old it might be, I looked at the tuning dial to see if it would give me any clues.

The tuning dial

Clearly from before the Big BBC Renaming of the late 1960s. I’m not sure how much it can be trusted for dating, though, as Radio Athlone officially changed to Radio Éireann in the 1930s, but I was fairly sure the radio probably wasn’t quite that old. Of course, I should really have been looking at the bottom.

The makers' plate

And of course the internet can tell you exactly when a Murphy BU183M was first sold: 1956, a revision of the 1952 BU183, which had the same case. The rather more stylish B283 model came out the following year, so I suspect not that many of the BU183M were made.

I’m intrigued by the wide range of voltages it can run off: nowadays that sort of input voltage range is handled simply and automatically by power electronics, but in the 1950s you had to open your radio up and make sure the transformer was set correctly before you tried to plug it in, just in case you were about to blow yourself up otherwise. I suppose this is what radio shops were for, to do that for you, and potentially to hire out the large, chunky high-voltage batteries you might need if you didn’t have mains electricity. This radio is from the last years of the valve radio: low-voltage transistor sets were about to enter the marketplace and completely change how we listened to music. This beast—or the B283, which at least looks like an early transistor radio—needed a 90-volt battery to heat up the valves if you wanted to run them without mains power, not the sort of battery you can easily carry around in your handbag. The world has changed a lot in seventy years.

State of independence

Or, getting the web back to its roots

When I rewrote and “relaunched” this site, back in 2020, I very consciously chose to stay simple. I didn’t want to tie myself to one of the major “content platforms”, because over the years too many of them have closed down on barely more than a whim. I didn’t want a complex system that would be high-maintenance in return for more functionality. I didn’t want to have to moderate what other people might want to say in my space. More importantly, though, I did want a space more like the online spaces I inhabited 20 or so years ago; or at least, like the online spaces of my imagination, where people would create in their own little corner not worrying about influence or monetisation or that sort of thing. It’s possible that place never really existed, except in my mind, but it was something I always aspired towards, and it was a place where I met a whole load of other people who shared a similar outlook on why they were writing down so much stuff out there on the internet for other people to read. That was why, when I rewrote this site, I kept it simple, and produced a static site that could be hosted almost anywhere, with source code that can be put into any private Git hosting service. I didn’t even go for one of the mainstream static site generators; I chose a relatively simple and straightforward open-source one that works by gluing a number of other open-source tools together to output HTML. It’s about as plain and independent as you can get.

There is, nowadays, a movement towards making the web more independent, making it more like it used to be, or at least as some of us remember it. It’s called the IndieWeb movement. The basic idea behind the IndieWeb is exactly this: that when you, an individual, post something online, it should stay yours. It should belong to you, under your control, forever. Essentially, that’s one of the main things I’ve always been aiming for.

I’m clearly IndieWeb-adjacent, whatever that phrase I’ve just invented means. This site, though, is a long way from being IndieWeb-compliant. And the reason is: I’ve looked through their Getting Started pages, and, frankly, it takes effort. That might sound like me being lazy, and I’d be the first to agree that I am lazy, but it’s also because there are only so many hours in the day. The day job takes up a good chunk of them, of course, then there are The Children, there are my other coding projects, all my craft projects,* the various organisations I do volunteer work for, all the other ways I’m trying to improve myself, not to mention the attraction of just going out for a long walk for a few hours. Aside from the original setup and occasional tweaks, this site is largely something to exercise the side of my brain that isn’t involved in coding. Spending time setting up and creating my own personal h-card, and automating syndication, isn’t really something I want to do in my relaxation hours.

Hopefully, though, the idea behind IndieWeb will grow, and will flourish, and we can make the web something that isn’t driven by advertising revenue, or by monetising hate and bigotry. I’d like us to make the web a place where seeds have space to germinate and flower, where everyone controls their own output and can express themselves without the point being to increase shareholder value or to feed the ego of some not-as-bright-as-he-thinks entrepreneur. Maybe I’ll add more IndieWeb features to this site, one by one, as time goes by. Hopefully, whatever I do, I’ll just keep doing my own thing for as long as it makes me happy.

* I mean, I literally started two separate new ones yesterday.

The Paper Archives (part two)

More relics from the past

The previous post in this series is here.

Spending some more time going through the things The Parents should arguably have thrown out decades ago, I came across a leather bag, which seemed to have belonged to my father. Specifically, he seemed to have used it for going to college, in the 1970s. Him being him, he’d never properly cleaned it out, so it had accumulated all manner of things from all across the decade. There were “please explain your non-attendance” slips from 1972; an unread railway society magazine from 1977; and the most recent thing with a date on was an Open University exam paper from 1983. It was about relational database design, and to be honest some of the questions wouldn’t be out of place in a modern exam paper if you asked for the answers in SQL DDL rather than in CODASYL DDL, so I might come back to that and give it its own post. What he scored on the exam, I don’t know. There were coloured pencils, and an unopened packet of gum.

Juicy Fruit gum

It seems to be from before the invention of the Best Before date, but the RRP printed on the side is £0.04.

Slightly more expensive: a rather nice slide rule. Look, it has a Standard Deviation scale and all. Naturally, my dad being my dad, it was still in its case and with the original instruction book, which will be useful if I ever try to work out how to use it.

Slide rule

And finally (for today) I spotted what appeared to be a slip of paper at the bottom of the bag with “NEWTON’S METHOD” written on it in small capitals, in fountain-pen ink. Had he been cheating in his exams? Had he written a crib to the Newton-Raphson method down and slipped it into the bottom of the bag? I pulled it out and…I was wrong.

Paper tape

It was a rolled-up 8-bit paper tape! Presumably with his attempt at a program to numerically solve a particular class of equation using Newton’s method.

I don’t know what type of machine it would have been written for, but I could see that it was likely binary data or text in some unfamiliar encoding, as whichever way around you look at it a good proportion of the high bits would be set so it was unlikely to be ASCII. Assuming I’m holding the tape the right way round, this is a transcription of the first thirty-two bytes…

0A 8D 44 4E C5 A0 35 B8 0A 8D 22 30 A0 59 42 A0 47 4E C9 44 C9 56 C9 44 22 A0 D4 4E C9 D2 50 A0
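To put a number on “a good proportion”, here is a minimal sketch in C that counts how many of those thirty-two bytes have their top bit set. The names TAPE_BYTES and count_high_bit_bytes are made up for this example, and it assumes my transcription above is accurate and that the tape is the right way round.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The first thirty-two bytes of the tape, as transcribed above. */
const uint8_t TAPE_BYTES[32] = {
    0x0A, 0x8D, 0x44, 0x4E, 0xC5, 0xA0, 0x35, 0xB8,
    0x0A, 0x8D, 0x22, 0x30, 0xA0, 0x59, 0x42, 0xA0,
    0x47, 0x4E, 0xC9, 0x44, 0xC9, 0x56, 0xC9, 0x44,
    0x22, 0xA0, 0xD4, 0x4E, 0xC9, 0xD2, 0x50, 0xA0
};

/* Count the bytes whose top bit (0x80) is set; plain 7-bit ASCII has none. */
size_t count_high_bit_bytes(const uint8_t *data, size_t length)
{
    size_t count = 0;

    assert(data != NULL);
    for (size_t i = 0; i < length; i++) {
        if (data[i] & 0x80)
            count++;
    }
    return count;
}
```

Calling count_high_bit_bytes(TAPE_BYTES, sizeof TAPE_BYTES) gives 15, so nearly half of the bytes fall outside the 7-bit ASCII range.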

That’s clearly not ASCII. In fact, I think I know what it might be: an 8080/Z80 binary. I recognise those repeated C9 bytes: that’s the opcode for the ret instruction, which has survived all the way through to the modern-day x64 instruction set. If I try to hand-disassemble those few bytes assuming it’s Z80 code we get:

ld a,(bc)
adc a,l
ld b,h
ld c,(hl)
push bc
and b
dec (hl)
cp b
ld a,(bc)
adc a,l

This isn’t the place to go into Z80 assembler syntax (that might be a topic for the future) other than to say that the destination comes first and brackets are a pointer dereference, so ld c,(hl) means “load register c with the value in the memory location whose address is in register hl”. As valid code it doesn’t look too promising to my eyes (I didn’t even realise dec (hl) was something you could do) but I’ve never been any sort of assembly language expert. The “code” clearly does start off making assumptions about the state of the registers, but on some operating systems that would make sense. This disassembly only takes us as far as the repeated 0A8D, though: maybe that’s some sort of marker separating segments of the file, and the actual code is yet to come. The disassembly continues…

ld (&a030),hl
ld e,c
ld b,d
and b
ld b,a
ld c,(hl)
ret
ld b,h
ret
ld d,(hl)
ret
ld b,h
ld (&a0d4),hl
ld c,(hl)
ret
jp nc,(&a050)

Well, that sort of makes some sort of sense. The instructions that reference fixed addresses all appear to point to a consistent place in the address space. It also implies code and data are in the same address space, in the block starting around &a000, which means you’d expect that some of the binary wouldn’t make sense when disassembled. If this was some other arbitrary data, I’d expect references like that to be scattered around at random locations. As the label says this is an implementation of Newton’s method, we can probably assume that this is a college program that includes an implementation of some mathematical function, an implementation of its first derivative, and the Newton’s method code that calls the first two repeatedly to find a solution for the first. I wouldn’t expect it to be so sophisticated as to be able to operate on any arbitrary function, or to work out the derivative function itself.

If I could find jumps or calls pointing to the instructions after those ret opcodes, I’d be happier. Maybe, if I ever have too much time on my hands, I’ll try to decompile the whole thing.
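A first step towards that could be sketched in C. Two large blocks of the 8080/Z80 opcode map are regular enough to decode with a little arithmetic rather than a lookup table: opcodes 0x40 to 0x7F are the register-to-register ld instructions, and 0x80 to 0xBF are the accumulator arithmetic and logic instructions. The function names below are invented for illustration, and everything outside those two blocks (including ret, push bc, dec (hl) and the instructions that carry an address) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Operand order used by the opcode map: index 6 is the byte at (hl). */
static const char *const REGISTERS[8] = {
    "b", "c", "d", "e", "h", "l", "(hl)", "a"
};

/* The eight accumulator operations encoded in opcodes 0x80-0xBF. */
static const char *const ALU_OPS[8] = {
    "add a,", "adc a,", "sub ", "sbc a,", "and ", "xor ", "or ", "cp "
};

/*
 * Decode one single-byte opcode from the two regular blocks of the map.
 * Writes the mnemonic into out and returns 1, or returns 0 for any
 * opcode this sketch does not cover.
 */
int decode_regular_opcode(uint8_t opcode, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    if (opcode == 0x76) {                     /* the one exception in the ld block */
        snprintf(out, out_size, "halt");
        return 1;
    }
    if (opcode >= 0x40 && opcode <= 0x7F) {   /* ld destination,source */
        snprintf(out, out_size, "ld %s,%s",
                 REGISTERS[(opcode >> 3) & 7], REGISTERS[opcode & 7]);
        return 1;
    }
    if (opcode >= 0x80 && opcode <= 0xBF) {   /* accumulator arithmetic and logic */
        snprintf(out, out_size, "%s%s",
                 ALU_OPS[(opcode >> 3) & 7], REGISTERS[opcode & 7]);
        return 1;
    }
    return 0;
}

/* Convenience check: does the opcode decode to the mnemonic we expected? */
int mnemonic_matches(uint8_t opcode, const char *expected)
{
    char decoded[16];

    if (!decode_regular_opcode(opcode, decoded, sizeof decoded))
        return 0;
    return strcmp(decoded, expected) == 0;
}
```

Fed the bytes from the hand-disassembly, this agrees with it where it applies: 0x4E comes out as ld c,(hl), 0x8D as adc a,l and 0xB8 as cp b, while 0xC9 is reported as not covered.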

The next post in this series is here

Teaching an image to think

Computers work in unexpected ways

Following on from yesterday’s post about log4j: another security article fascinated me in the last week, too. You might have already seen it, because it was widely shared on Twitter and computer people everywhere were amazed and aghast at its engineering and its possibilities. The log4j vulnerability is a relatively pedestrian one by comparison, using something that is an entirely documented and public feature of the library. This, on the other hand, is a completely different animal.

It’s a hack which lets you run code on a stranger’s iPhone just by sending them a message. They don’t have to click on anything, they don’t even have to open it, all their phone has to do is receive it and the hacker can take their phone over. At least, could: the fix for this security hole was released three months ago, in iOS 14.8. If you are running an older version of iOS on your phone or tablet, then, er, maybe don’t. The analysis of how this hack works, by Google Project Zero, has started to be published; and if you’re a programming nerd, it is beautiful and amazing and horrific in just the same way that a biological virus is.

In short, this hack relied on the fact that an iOS device, when it receives an animated GIF, tries to hack the GIF a little so it will always loop forever whatever the GIF itself actually says to do. It does this in an unhealthy way, though. When it opens the file to change it, it doesn’t matter if it’s not actually a GIF. The software will try to be clever and say “ah, looks like your file’s got the wrong name there, don’t worry, I still know how to open one of these” and do it. Even if it’s not a GIF and therefore doesn’t really need to.

Secondly, the hack relies on a bug in an open source PDF-reading library, in the part of the code used to open embedded images that are in an obscure and rather out-of-date format mostly used by fax machines. PDF is a big, complex and rambly format (believe me I know, I’ve been on-off trying to write a .NET PDF writing library for some years now) so it’s not surprising there are bugs and holes in PDF-reading software. What this hack does, though, is frankly brilliant. It uses the capabilities of the compression algorithm of this particular graphics format to implement an entire virtual CPU in the memory of the target device. It’s a small CPU but it is a Turing-complete one, which in technical terms means that if you ignore practical limits of time and memory, it’s just as powerful as any other computer. An entire virtual CPU…created by feeding a carefully-designed image into a buggy image decompression routine.*

Frankly, if you’re a software developer, this is genius. Evil genius, to be sure, but genius nonetheless. I’m somewhat in awe of it, in a dirty way. It’s a wonderful level of lateral thinking, to know that the bug is there to exploit and work out a way to reach it and trip it up to begin with; and then to build an entire virtual machine from the basic Boolean logic operations available inside a particular image format. As I said above, it’s beautiful, it’s amazing, and it’s horrific in the original sense of the word. It’s awe-inspiring. I might be good at my job, but I can only look upon this with amazement and envy.

* I assume the image itself looks like just so much white noise if you could actually view it, but you can’t have everything. It reminds me a little of Neal Stephenson’s early-90s novel Snow Crash, in which a carefully-designed image that looks like white noise can hack the viewer’s brain.

Some logical relief

In which we discuss a topical flaw

In many ways I lead a charmed life and hold a wide range of privileges in my hand. Not least, this week just gone, the fact that I’m a software developer who generally works with the .NET software stack. More specifically, I am not a software developer who works with Java. Java developers have not, generally speaking, been having a good week.

This is all because of a software vulnerability discovered just over a week ago in a Java library called “log4j”. To summarise, for non-experts: “log4j” is a logging library. No, not the let’s-clear-the-rainforests sort. “Logging” means your software writing diagnostic information as it goes along: records such as “user etoainshrdlu asked to see their bank balance at 9.10am from this address with that web browser”. You can see why…

Regular reader E Shrdlu (from Clacton) writes: Oi! You can’t go around giving my bank balance to people!

Hush now, I was just using you as an example! You can see why it’s useful to have this information stored away somewhere, and log4j is a software library that makes it really easy to do. Virtually all Java server-side code out there uses log4j somewhere inside it, to handle this sort of thing.

Unfortunately, log4j has a few features that were originally intended to be handy, but aren’t necessarily a good idea to have running on an internet-facing server that does important work such as processing your banking requests. Particularly, in this case, if you put a certain specialist type of URL into a log record, log4j will see it, try to download another program from it, and will then run that program in a certain well-defined way. Of course, you might say, there’s nothing wrong with that because all of the log record messages are just written by the bank’s own software developers, so everything’s perfectly safe. However, as I said above, one thing they may very well be logging is which browser you happen to be using, because that’s very useful diagnostic data if people start having problems. “Which browser you happen to be using”, though, is just a field that you send them, and if you know what you’re doing, you can change it to whatever you want to. Including a special type of URL which will…well, hopefully you get the picture. And now you’re running whatever programs you like on one of your bank’s internal servers. Ah. You can see now why Java developers have not been having a good week.

The fix for this is straightforward, but rolling the fix out will have involved a huge proportion of the Java code running in the world being checked, double-checked, and redeployed when it’s known to be safe. Moreover, all of the developers doing this will have had several queries a day from their managers asking just how much they are exposed to this issue. I know: I’ve had several myself, even though my response is straightforwardly “we don’t run any Java code at all, so don’t worry.” I do tell them to tell the clients we have thoroughly and conscientiously audited our systems because from a client-relations point of view it does sound a bit more professional than “no, and our tech lead is very glad of her career choices”. But it still means plenty of messages for me to answer.

Incidentally, I don’t feel any sort of schadenfreude about this, in case you were wondering. I genuinely feel sorry for a lot of people I know, who will not have had a good week fixing this stuff. I’ve worked in big banks and other similar organisations, and I know a lot of former colleagues and current friends who will have spent the last week focusing on this above all else. It’s not nice when you are suddenly bowled by a risk like this; and moreover, it’s not as if Java is uniquely likely to suffer from this type of problem. There are nuances to this that I may come back to in a later post; but next time something like this happens, the person fixing it might well be me.

Code archaeology

When things become relevant again

One thing I have been doing over the past few weeks is: finally, finally, taking the hard drive out of my last desktop computer—last used about 8 years ago at a guess—and actually copying all the documents off it. It also had stuff preserved from pretty much every desktop machine I’d had before that, so there was a whole treasure-chest of photographs I hadn’t seen in years, things I’d written, and various incomplete coding projects.

Some of the photos will no doubt get posted on here over the coming weeks, but this post isn’t about those. Because, by pure coincidence, I was browsing my Twitter feed this morning and saw this tweet from @ireneista:

we were trying to help a friend get up to speed on how to make a Unix process into a daemon, which is something we found plenty of guides on in the 90s but it’s largely forgotten knowledge

Hang on a minute, I think. Haven’t I just been pulling old incomplete coding projects off my old hard disk and saving them into GitHub repositories instead? And don’t some of those have exactly that code in? A daemon, on Unix, is roughly the equivalent of a “Service” on Windows. It’s a program that runs all the time in the background on a computer, doing important work.* Many servers don’t even run anything else to speak of. On both Unix and Windows systems, there are special steps you have to take to properly “detach” your code and let it run in the background as part of the system, and if you don’t do all those steps properly you will either produce something that is liable to break and stop running when it’s not supposed to, or write something that fills up your system’s process table with so-called “zombie” entries for processes that have stopped running but still need some bookkeeping information kept about them.
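To make the “zombie” bookkeeping a bit more concrete, here is a minimal sketch in C, assuming a POSIX system. The function name spawn_and_reap is made up for this example. A child that has exited stays in the process table until its parent collects its exit status, and waitpid() is the call that does the collecting.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with exit_code, then reap it.
 * Returns the child's exit code, or -1 if fork() or waitpid() failed.
 */
int spawn_and_reap(int exit_code)
{
    int status = 0;
    pid_t child;

    assert(exit_code >= 0 && exit_code <= 255);

    child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(exit_code);   /* child: finish at once and wait to be collected */

    /* parent: collecting the status is what clears the process table entry */
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A parent that never makes that waitpid() call, and never exits itself, is what leaves zombie entries piling up. If the parent does exit, the orphaned child is adopted by the system’s init process, which does the reaping instead.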

Is this forgotten knowledge? Well, it’s certainly not something I would be able to do, off the top of my head, without a lot of recourse to documentation. For a start, all the past projects I’m talking about were written in C, for Linux systems, and I haven’t touched either the language or the operating system much for a number of years now.

None of the projects I’m talking about ever approached completion or were properly tested, so there’s not that much point releasing their full source code to the world. However, clearly, the information about how to set up a daemon has disappeared out of circulation a bit. Moreover, that code was generally stuff that I pulled wholesale from Usenet FAQs myself, tidying it up and adding extra logging as I needed, so compared to the rest of the projects, it’s probably much more reliable. The tweet thread above links to some CIA documentation released by Wikileaks which is nice and explanatory, but doesn’t actually include some of the things I always did when starting up a daemon. You could, of course, argue they’re not always needed. So, here is some daemonisation code I have cobbled together by taking an average across the code I was writing about twenty-ish years ago and adding a bit of explanation. Hopefully this will be useful to somebody.

Bear in mind this isn’t real code: it depends on functions and variables that you can assume we’ve declared in headers, or in the parts of the code that have been omitted. As the old saying goes, I accept no responsibility if this code causes loss, damage, or demons flying out of your nose.**

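/* Headers for the calls used below; your own code will need more */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <syslog.h>
#include <unistd.h>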

int main(int argc, char **argv)
{
    /* 
     * First you'll want to read config and process command line args,
     * because it might be nice to include an argument to say "don't
     * run as a daemon!" if you fancy that.
     *
     * This code is also written to use GNU intltools, and the setup for that
     * goes here too.
     */

    /* Assume the daemonise variable was set by processing the config */
    if (daemonise)
    {
        /* First we fork to a new process and exit the original process */
        switch (fork ())
        {
        case -1:
            syslog (LOG_ERR, _("Forking hell, aborting."));
            exit (EXIT_FAILURE);
        default:
            exit (0);
        case 0:
            break;
        }

        /* Then we call setsid() to become the leader of a new session and process group,
         * making sure we are detached from any terminals */
        if (setsid () == -1)
        {
            syslog (LOG_ERR, _("setsid() failed, aborting."));
            exit (EXIT_FAILURE);
        }

        /* Then we fork again, so that we are no longer a session leader and can
         * never pick up a controlling terminal */
        switch (fork ())
        {
        case -1:
            syslog (LOG_ERR, _("Forking hell x2, aborting."));
            exit (EXIT_FAILURE);
        default:
            exit (0);
        case 0:
            break;
        }

        /* Next, a bit of cleanup.  Set a predictable umask, change our CWD to / so we
         * don't block any umounts, and redirect our standard streams to taste */
        umask (0022);
        if (chdir ("/"))
            syslog (LOG_WARNING, _("Cannot chdir to root directory"));
        freopen ("/dev/null", "w", stdout);
        freopen ("/dev/null", "r", stdin);
        freopen ("/dev/console", "w", stderr); /* This one in particular might not be what you want */

        /*
         * Listen to some signals.  The second parameters are function pointers which 
         * you'll have to imagine are defined elsewhere.  Reloading config on SIGHUP
         * is a common daemon behaviour you might want.  I can't remember why I thought
         * it important to ignore SIGPIPE
         */
        signal (SIGPIPE, SIG_IGN);
        signal (SIGHUP, warm_restart);
        signal (SIGQUIT, graceful_shutdown);
        signal (SIGTERM, graceful_shutdown);

    }

    /* And now we're done!  Let's go and run the rest of our code, whether or
     * not we daemonised */
    run_the_daemon ();
    return EXIT_SUCCESS;
}

The above probably includes some horrible mistake somewhere along the way, but hopefully it’s not too inaccurate, and hopefully would work in the real world. If you try it—or have opinions about it—please do get in touch and let me know.
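The two signal handlers are the part I left entirely to your imagination, so here is a rough sketch of one way they might look. The names warm_restart and graceful_shutdown are the ones used above; the two flags, install_handler(), install_daemon_signals() and daemon_should_continue() are hypothetical names made up purely for illustration. It also swaps signal() for sigaction(), which is a substitution on my part rather than what the original code did. The idea is that the handlers do nothing except set a flag, and the main loop does the real work when it next comes round.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

/* Hypothetical flags: the handlers only set these, and the main loop
 * acts on them later, outside signal context */
static volatile sig_atomic_t reload_requested = 0;
static volatile sig_atomic_t shutdown_requested = 0;

static void warm_restart (int signum)
{
    (void) signum;
    reload_requested = 1;
}

static void graceful_shutdown (int signum)
{
    (void) signum;
    shutdown_requested = 1;
}

/* Hypothetical helper: install one handler, returning 0 or -1 like sigaction() */
static int install_handler (int signum, void (*handler) (int))
{
    struct sigaction sa;

    memset (&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset (&sa.sa_mask);
    return sigaction (signum, &sa, NULL);
}

/* The same four signals as the listing above */
int install_daemon_signals (void)
{
    /* With SIGPIPE ignored, writing to a closed pipe or socket fails with
     * EPIPE instead of killing the process */
    if (install_handler (SIGPIPE, SIG_IGN) == -1)
        return -1;
    if (install_handler (SIGHUP, warm_restart) == -1)
        return -1;
    if (install_handler (SIGQUIT, graceful_shutdown) == -1)
        return -1;
    if (install_handler (SIGTERM, graceful_shutdown) == -1)
        return -1;
    return 0;
}

/* Call once per pass of the main loop; returns 0 when it is time to stop */
int daemon_should_continue (void)
{
    if (reload_requested)
    {
        reload_requested = 0;
        /* Re-read the config file here */
    }
    return !shutdown_requested;
}
```

Inside run_the_daemon() you would then write something like "while (daemon_should_continue ()) { do one unit of work }", and tidy up after the loop ends. Keeping the handlers this small avoids calling anything from signal context that isn't safe to call there, such as syslog().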

* NB: this is a simplification for the benefit of the non-technical. Yes, I know I’m generalising and lots of daemons and services don’t run all the time. Please don’t write in with examples.

** “demons flying out of your nose” was a running joke in the comp.lang.c Usenet group, for something it would be considered entirely legitimate for a C compiler to do if you wrote code that was described in the C language standard as having “undefined behaviour”.

Milestones

Or, how and how not to learn languages

I passed a very minor milestone yesterday. Duolingo, the language-learning app, informed me that I had a “streak” of 1,000 days. In other words, for the past not-quite-three-years, most days, I have fired up the Duolingo app or website and done some sort of language lesson. I say “most days”: in theory the “streak” is supposed to mean I did it every single day, but in practice you can skip days here and there if you know what you’re doing. I’ve mostly been learning Welsh, with a smattering of Dutch, and occasionally revising my tourist-level German.

My Welsh isn’t, I have to admit, at any sort of level where I can actually hold a conversation. I barely dare say “Ga i pysgod a sglodion bach, plîs,” in the chip shop when I’m visiting Welsh-speaking Wales, because although I can say that, I am wary that I wouldn’t be any use at comprehending the response if they need to ask, for example, exactly what type of fish I want. To be honest, I see this as a big drawback to the whole Duolingo-style learning experience, which seems essentially focused around rote learning of a small number of set phrases in the hope that a broader understanding of grammar and vocabulary will follow. I’ve been using Duolingo much longer than three years: I first used it to start revising my knowledge of German back in 2015. When I last visited Germany, though, I was slightly confused to find that after over a year of Duolingo, if anything, I felt less secure in my command of German, less confident in my ability to use it day-to-day. Exactly why I don’t know, but it helped me realise that I can’t just delegate that sort of learning to a question-and-answer app. If I want to progress with my Welsh, I know I’m going to have to find some sort of conversational class.

Passing the 1,000 days milestone made me start wondering if anyone has produced something along the same lines as Duolingo but for computer languages. In some ways it should be a less difficult problem than for natural language learning, because, after all, any nuances of meaning are less ambiguous. I lose track of the number of times Duolingo marks me down because I enter an English answer which means the same as the accepted answer but uses some other synonym or has a slightly different word order. With a coding language, if you have your requirements and the output meets them, your answer is definitely right. In theory it shouldn’t be too hard to create a Duolingo-alike thing but with this sort of question:

Given a List<Uri> called uris, return a list of the Uris whose hostnames end in .com in alphabetical order.

  1. uris.Select(u => u.Host).Where(h => h.EndsWith(".com")).OrderBy();
  2. uris.Where(u => u != null && u.Host.EndsWith(".com")).OrderBy(u => u.AbsoluteUri).ToList();
  3. uris.SelectMany(u => u.Where(Host.EndsWith(".com"))).ToList().Sort();

The answer, by the way, is 2. Please do write in if I’ve made any mistakes by being brave enough to write this off the top of my head; writing wrong-but-plausible-looking code is harder than you think. Moreover, I know the other two answers contain a host of errors and wouldn’t even compile, just as the wrong answers in Duolingo often contain major errors in grammar and vocabulary.

Clearly, you could do something like this, and you could memorise a whole set of “cheat sheets” of different coding fragments that fit various different circumstances. Would you, though, be able to write decent, efficient, and most importantly well-understood code this way? Would you understand exactly the difference between the OrderBy() call in the correct answer, and the Sort() call in answer three?* I suspect the answer to these questions is probably no.

Is that necessarily a bad thing, though? It’s possibly the level that junior developers often work at, and we accept that that’s just a necessary phase of their career. Most developers start their careers knowing a small range of things, and they start out by plugging those things together and then sorting the bugs out. As they grow they learn more, they fit things together better, they start writing more original code and slowly they become fluent in writing efficient, clean and idiomatic code from scratch. It’s a good parallel to the learner of a natural language, learning how to put phrases together, learning the grammar for doing so and the idioms of casual conversation, until finally they are fluent.

I realise Duolingo is only an early low-level step in my language-learning. It’s never going to be the whole thing; I doubt it would even get you to GCSE level on its own. As a foundational step, though, it might be a very helpful one. One day maybe I’ll be fluent in Welsh or German just as it’s taken me a few years to become fully fluent in C#. I know, though, it’s going to take much more than Duolingo to get me there.

* The call in answer 2 is a LINQ method which does not modify its source but instead returns a new sequence containing the sorted data. The call in answer 3 sorts the list in-place and returns nothing, so there is no result to hand back.

Alternate reality

When you can't use Google as a verb

Many people are concerned about just how much corporate technological behemoths have embedded themselves into our lives nowadays. A few years ago now I spent a few days in meetings with some Microsoft consultants at their main British headquarters, and I entertained myself by counting the number of times I saw a pained look on the face of a Microsoft staffer having to physically stop themselves using “Google” as a verb. “We’ll just do a…” wince “…internet search for that.”*

The people I feel sorry for now, though, are the producers of TV shows. Yes, a particular website or app might be key to your plot, it might be vital to the everyday life of your characters, but you can’t use it, because no doubt its owners will be greatly upset if you do. So, for TV, thousands of working hours are spent producing mockup apps and mockup websites for the characters to use on-screen.

An award surely has to go to the producers of Australian police drama Deep Water, a rather good drama series about gay hate murders in Sydney. Their murder victim was obviously going to be using apps such as Grindr to meet guys, but they couldn’t show it on-screen: so, they invented—or, I assume they invented—an app called Thrustr for him to use instead. Now there’s a name that’s even better than the real thing.

What really made me want to write about this, though, is the Netflix series The Stranger, released earlier this year. Its not-Google-honest website is rather tasteful and well-designed, the Google screen layout but with a logo of interconnecting blue dots and lines that could, just about, plausibly be a Google Doodle that isn’t quite legible enough to make out the words of. When it comes to apps, though: they have a whole bevy of them, to fulfil whatever magical device the plot needs at the time. A phone-tracking app that uses some sort of dark-mode map layer for Extra Coolness. An app to allow the organisers of illegal raves to, well, organise illegal raves anonymously, but that also tells you where its anonymous users are. Of course, all these tracking apps always track people perfectly. They always have a mobile data signal and a good GPS fix, even in the city centre. The map view always updates exactly in real time: I hate to think how much battery power they must be using up sending out all those continual location updates.

The Stranger is set in a genericised North-West England: Cheshire and Lancashire with the place-names filed off. Because of that, it has the usual issues any sort of attempt at a “generic landscape” always has when it uses very recognisable places. The characters somehow manage to catch a through train, for example, from the very recognisable Stockport station to the equally recognisable Ramsbottom station, despite one being a busy main-line junction and the other being a silent, deserted heritage line. Talking of trains, there was also a rather fun chase sequence around Bury Bolton Street yard, although the joyless side of me has to say that you really shouldn’t crawl under stock the way they were doing. Nor can you in real life lean against the buffers of the average Mark 1 carriage and stay as clean as the characters did. Anyway. I was saying how unrealistic the GPS-tracking apps on the characters’ phones were: the one that really made me laugh out loud was when one character says that, as a given car registration is a hire car, he’ll be able to hack into the hire firm’s vehicle telemetry and get its current location in the time it takes to boil a kettle.

Admittedly, I have specialist knowledge here, because I used to be in charge of the backend tech for one particular vehicle telemetry provider’s systems. But the whole idea: assuming that they can see from the VRN which hire firm owns the car, they then have to know which telemetry firm that particular hire firm uses, and then know how to get in. Unless you do happen to have a notebook of where every car hire firm gets their telemetry services, and then have backdoors or high-level login credentials to every system, which I suppose is just about plausible for a private investigator, you’re stuffed. Getting in without that? Whilst someone makes you a cup of tea? Not feasible, at all.

Yes, maybe I’m applying unreasonable standards here for keeping my disbelief suspended. It seemed to be a particularly bad example, though, of technology either being magically accurate or terribly broken according to the requirements of the plot at any given time. Does the plot need you to know exactly where someone’s phone is? Bang, there you go. Does it need a system to be breakable on demand within seconds? All passwords to be immediately crackable as long as the right character is doing the cracking? No problem. Oh well: at least their not-Google looked relatively sensible.

* They all used Chrome rather than Edge or IE, though. Things might be different now Chromium Edge is out.

Feeling at home

On inclusion and diversity

Serious posts are hard to write, aren’t they? This article has been sitting in my drafting pile for a couple of months, and has been sitting around taking up space in my head for most of the past year. It’s about an important topic, though, one that is close to me and one that I think it’s important to discuss. This post is about diversity and inclusion initiatives, in the workplace in general, and specifically in the sort of workplaces I’ve experienced myself, so it will tend to concentrate on offices in general and tech jobs in particular. If you work in a warehouse or factory, your challenges are different and I suspect in many ways a lot harder to deal with, but it is not something I am myself in a position to speak on.

It’s fair to say, to start off with, firstly that my career has progressed a lot since I first started this website; and also that attitudes to diversity and inclusion have changed a lot over that time too. I’ve gone from working in businesses where you would have been laughed at for suggesting it at all mattered or should even be considered, to businesses that care deeply about diversity and inclusion because they see that it is important to them for a number of reasons. What I still see a lot, though, are businesses that start with the thought “diversity is important, so how do we improve it?”, and I think that, frankly, they have things entirely the wrong way round. If instead they begin from a starting point of “inclusivity is important, so how do we improve it?”, diversity will naturally follow. If you try to make your workplace an inclusive workplace from top to bottom, in across-the-board ways, then you will create a safe place for your colleagues to work in. If your colleagues feel psychologically safe when they are at work, they’ll be more productive, you’ll have better staff retention rates, and people will actively want to work for you.

The Plain People Of The Internet: But I’ve always felt happy at my desk, chair reclined, just being me, anyway. It’s not something we have trouble with!

But this is where the inclusivity part really comes into it. There are always going to be some people who feel at home wherever they are. They’re usually the people who are happy in their own identity, which is very nice for them. They’re also the people who expect everyone else to go along with what they want, which is less nice. The people who say “well I have to put up with things in my life, so I don’t see why we should make life easier for everyone else,” and “they’re just trying to be different because they want the attention.” These are the people who are going to have to have their views challenged, in order to make the office round them a truly inclusive place for everybody. At the same time, though, you can’t ignore these people, because inclusivity has by definition to include everybody. You have to try to educate them, which is inevitably going to be a harder job.

For that matter, you always have to remember that you don’t truly know your colleagues, however well you think you do—possibly barring a few exceptions such as married couples who work together, but even then, this isn’t necessarily an exception. You don’t know who in your office might have a latent mental health issue. You don’t know who might have a random phobia or random trauma which doesn’t manifest until it is triggered. Whatever people say about gaydar, you don’t know the sexuality of your colleagues for certain—they might have feelings they daren’t even admit to themselves, and the same goes for gender identity and no doubt a whole host of other things. You can never truly know your colleagues and what matters to them, or who they really are inside their heads.

The Plain People Of The Internet: So now you’ve gone and made this whole thing impossible then!

No, not at all; it’s just setting some basic ground rules. In particular, a lot of companies love “initiatives” on this sort of thing, but they tend to be very centralised, top-down affairs: “we’ll put a rainbow on our logo and organise a staff party”. Those aren’t necessarily bad things to do in themselves, but I strongly believe that to be truly successful, inclusivity has to come from the ground upwards. The best thing you can have is staff throughout the organisation who care about this sort of thing, if they can be given the opportunity to gather people around them, educate them about the importance of the whole thing, and push for change from the bottom upwards.

The Plain People Of The Internet: Aha, I get you now! Get all the minorities together, shut the boring white guys out of the room, and get the minorities to tell us how to sort it out!

No! Firstly, the people who you need to get to seed things off are the people who are passionate about it; moreover, people who are optimistic that their passion is going to have an effect. That applies whoever they are, too. If you want to be inclusive, you must never shut out anyone who is passionate about the topic (with certain exceptions that we’ll come to) because, firstly, inclusivity is for everyone, and everyone has a part to play in it. Secondly, as I said above, you don’t know your colleagues: you don’t know why any particular colleague is passionate about it.

Deliberately making inclusivity and diversity the responsibility of the minorities on your staff is, I’d go so far as to say, nearly always a counter-productive option. For one thing, you want to find passionate people to drive this forward: you shouldn’t automatically assume that everyone who falls into a particular “minority” bucket in some way will be passionate about diversity and inclusion, or even that such a bucket exists. Equally, you need to be very wary of some people who will ride the concept as their own personal hobby-horse, and insist that they, personally, should be the arbiter of what diversity means. There are people out there who will insist that because they are disadvantaged in one way or another, they have the right to determine the meaning of diversity and inclusion in any organisations they are part of. These are the sort of people who conflate inclusivity across the whole office with advantage for themselves personally; they will insist that inclusivity and diversity efforts be focused solely on aspects that benefit them, and will attempt first to narrow the scope of diversity and then to gatekeep what is allowed inside. If you’ve followed my logic about diversity flowing from inclusivity and not vice-versa, you’ll immediately see that this is a nonsense. The reason the type of person I’m talking about doesn’t see it as such, is that they see it, even if they don’t realise it, as being something solely for their own benefit in one way or another.

The Plain People Of The Internet: Now you’re not making sense again! Find people that are passionate but not too passionate? You’re just looking for a team of nice milky liberals who won’t really do anything!

It’s difficult, really, to talk about hypotheticals in this sort of area, partly because every organisation and every situation genuinely is very different from another. I’m confident, though, that when you do start getting involved in this sort of area it’s straightforward to see the difference between the two kinds of passion I’m talking about: passion to improve everybody’s lives, or passion to get more for themselves. Sadly, the latter are often much louder, but it’s often very clear: they will be the people saying that they know Diversity and can precisely define it, because they are themselves more Diverse than anybody else so know exactly what needs to be done. The people who say “I’m not really sure what diversity is, but I know we need to get everyone’s input on it” are the people that you want on your team.

The Plain People Of The Internet: So what was the point of all this again? Just what are our team trying to do here?

Make your workplace a more inclusive place, whatever that takes. Make sure that nobody feels excluded from social events. Try to make everyone feel that they are on the same broad top-level team. Make sure that “soft” discriminatory behaviour is discouraged,* and that people are educated away from it: for example, teach people to use non-discriminatory language. Make sure your interview and hiring processes are accessible and non-biased—this is particularly important at the moment when doing remote interviewing, because requiring the candidate to pass a certain technical bar is inevitably going to exclude people. But, most importantly, when your passionately inclusive pathfinders of inclusivity come up with ideas and want to get them adopted, make sure they have the support and resources to actually get that done.

The Plain People Of The Internet: And then you’ll magically be Diverse with a capital D?

There’s a lot more to it than that, of course. People have written whole books on this stuff; I can hardly squeeze it all into a single blog post. But if you can find people to transform your office into a more inclusive space—a space where everyone can feel safe and at home—then you are one step along the road. Actually generating that atmosphere: another step. After that, your office will become somewhere that a diverse range of people feel comfortable working in, because it is a fully inclusive space and because everyone across that range can feel at home working there. And then your management can start being proud of being a diverse organisation, rather than deciding that you are going to be Diverse but not knowing how to get there on more than a superficial level.

The Plain People Of The Internet: Feel at home at the office? Pshaw! Terrible idea!

I agree with you completely that the office shouldn’t be your home, “working from home” notwithstanding. It’s still important to separate the two and not hand over your entire soul to the capitalist monster. Nevertheless, much as you might hate working for a living, if you do have to work for a living, it’s important for you to try to be as happy as you can be within that context. Finding a workplace that can be a safe place for you to exist in, whilst not being your home, is one way to go about that. It’s not really what this post is supposed to be about, but it’s a digression it might be a good idea to explore at some point.

This post is getting a bit long now, judging by the way my scrollbar is stretching down the screen. It’s a personal view. I don’t pretend to know all the answers, and it’s not a field I claim to be an expert in, but it’s a field that is important to me personally and it’s a suggestion towards a sensible approach to take. Diversity is important to all of us, because we are all diverse: none of us is any more diverse than the other, and none of us has the right to judge another’s lifestyle as long as it causes others no harm.** The key thing, to my mind, is accepting that genuine diversity does require acceptance and appreciation of this; and that if you want to become diverse, becoming inclusive first is by far the easiest approach.

The dichotomy really, I suppose, is between organic growth and forced construction. Consider, if you’ll forgive me another painful analogy, your workforce as the shifting sands of a beach. If you build a Tower of Diversity and Inclusion on top of those shifting sands, it will fall, or get swallowed up by the dunes. If you let a Forest of Inclusion and Diversity grow up through the sand, it will hold it together and make it more cohesive. I know it’s a bit of a daft analogy really, but hopefully it helps you see what I’m trying to painfully and slowly explain. If you try to be inclusive, and if you turn your workplace into a safe space for everyone to be themselves, the latter is hopefully what you will be able to grow.

* I’m working on the basis here that “hard” discriminatory or offensive language or behaviour is immediately called out and shut down, which I know isn’t always the case in all workplaces.

** I have cut a whole section out of a previous draft of this post, discussing how to spot people who use diversity as a shield to do horrible things. Hopefully, in most situations, it’s not something people have to worry about; it’s a shame that we ever do, but it does happen. Going round again, though, if an inclusive workplace is one where people feel safe to be themselves, it’s also one where hopefully people feel safe to report any transgressions and make sure they are dealt with. I have, sadly, heard of people who use diversity-styled language to try to defend themselves against accusations of abuse or of sexually predatory behaviour, and I’m not surprised there are some who think that diversity is some sort of loophole in that regard, because some people will always take whatever advantage they can.