Sysadmin SNAFU flushes whole company down the drain

Here’s a fun story that’s doing the rounds right now.

It’s the perfect anecdote to cheer up a Friday afternoon.

Actually, it isn’t, because it’s all about someone else’s deep misfortune.

It’s more of a “There, but for the grace of God, go I” story, but we thought we’d tell the tale anyway, so that you don’t go there yourself.

Just imagine…

Imagine that you were going to make a backup last night, but you never quite got a Round Tuit.

This morning, you got slammed by ransomware that scrambled all your files, so you breathed in really deeply, set your jaw firmly, got out your Bitcoin wallet…

…only to find that the crooks wouldn’t take your money, didn’t care about your files, just shrugged and walked away.

Except they didn’t just wipe your files, they wiped everybody’s: your own files, your staff’s files, your customers’ files, along with their web server configurations, their emails, their operating systems, everything, the whole nine yards, washed down the drain, into the river, out to the North Sea. [You’re mixing your metaphors again. Ed.]

And, anyway, it wasn’t the crooks that did it – it was you that did the damage, self-inflicted with a simple slip of the fingers.

Stuff happens

It happens, and sometimes a bit of what you might call “tough love” is all you’re going to get, which is what happened to a user called Bleemboy on Server Fault when he asked this question:

Bleemboy: I run a small hosting provider with more or less 1535 customers… All servers got deleted and the offsite backups too because the remote storage was mounted just before by the same script (that is a backup maintenance script).

How I can recover from a rm -rf / now in a timely manner?

“Tough love” answers came back within about half an hour from André, Sven and Michael:

André: If you really don’t have any backups I am sorry to say but you just nuked your entire company.

Sven: I feel sorry to say that your company is now essentially dead.

Michael: You’re going out of business. You don’t need technical advice, you need to call your lawyer.

To explain…

On a Unix-like system, rm is the command to remove a file, or to delete it, in the slightly blunter terminology of Windows.

The / means “the root directory,” short for starting at the very top of everything.

The -r means “recursive”, which is geek-speak for saying that you want to delete the subdirectories too, oh, and if they have subdirectories, even if they’re mapped to other drives on the network, or have removable disks mounted…heck, it means “spare nothing.”

Then, to make assurance double-sure, there’s -f, for “force,” which means not only that you won’t take no for an answer, but also that you don’t even want to bother asking in the first place.

Why in a script?

But why would any sysadmin put rm -rf / in a script, not least because the script would inevitably be one of its own victims? [Not necessarily, e.g. due to chroot, but don’t let me interrupt you. Ed.]

Surely you’d notice the self-contradictory nature of such a command?

In this case, the unfortunate sysadmin had written something like:

rm -rf $1/$2

The idea is that the items with the dollar signs are variables that get replaced at runtime, for example by setting $1=user/16504 and $2=retired-files/, so that the script can be used to handle archiving for different users and different directories at different times.

Unfortunately, as Bleemboy himself pointed out:

Those variables [were] undefined due to a bug in the code above.

You can figure out what happens if you replace $1 and $2 above with nothing at all.
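
If you’d rather not figure it out the hard way, here is a harmless sketch that prints the command the shell would build instead of running it. The helper name show_command is our own invention for illustration and is not part of Bleemboy’s script; nothing is deleted.

```shell
#!/bin/sh
# Hypothetical helper: print, rather than run, the command that the
# one-liner "rm -rf $1/$2" would expand to. Nothing is deleted here.
show_command() {
    echo "rm -rf $1/$2"
}

# The intended case: both arguments are set.
intended=$(show_command "user/16504" "retired-files/")

# The buggy case: both arguments are empty, as if never defined.
buggy=$(show_command "" "")

echo "$intended"   # prints: rm -rf user/16504/retired-files/
echo "$buggy"      # prints: rm -rf /
```

With both variables empty, all that is left of $1/$2 is the slash in the middle.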

What to do?

In computer science courses – even those that supposedly don’t explicitly deal with security – you will learn (and, hopefully, learn to appreciate) all sorts of generic protections against this sort of bug.

Security-conscious programming languages can help if they detect, trap and stop code where variables aren’t defined, to make sure you say what you mean, and mean what you say.
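
Even the shell can be asked to behave this way. As a minimal sketch, assuming a hypothetical helper called safe_target (again, not taken from the original script), the standard ${parameter:?message} expansion stops the shell if a variable is unset or empty, before any command gets to use the value. The related set -u option rejects unset variables too, though not ones that are set but empty.

```shell
#!/bin/sh
# Hypothetical helper: build the target path, but refuse to continue
# if either part is unset or empty.
safe_target() {
    # ${1:?...} writes the message to stderr and makes a non-interactive
    # shell exit with a non-zero status when $1 is unset or empty.
    printf '%s/%s\n' "${1:?first argument missing}" "${2:?second argument missing}"
}

# Run each call in a subshell so that a failure stops only that call.
good=$( (safe_target "user/16504" "retired-files") 2>/dev/null )
good_status=$?

bad=$( (safe_target "" "") 2>/dev/null )
bad_status=$?

echo "good: '$good' (status $good_status)"
echo "bad:  '$bad' (status $bad_status)"
```

In the second call, the expansion fails before printf ever runs, so no path is produced at all, which is exactly the point.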

Pair-programming can help, where you always work with a co-pilot, regularly swapping roles, so there are always two pairs of eyes on the job, and there’s always someone on the spot to ask the difficult questions when you start getting careless.

A rigorous testing process is vital, where you don’t just crack out the code and check that it mostly works, but also produce code to help to test your code, which includes testing that it fails correctly too, an outcome that is not an oxymoron in software engineering.
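
To make “fails correctly” concrete, here is a small sketch of a test that checks both the success case and the failure case. The purge function and the directory names are hypothetical, and everything happens inside a throwaway directory made by mktemp, so the test cannot touch real data.

```shell
#!/bin/sh
# All work happens inside a throwaway sandbox; stop if we cannot make one.
sandbox=$(mktemp -d) || exit 1

# Hypothetical cleanup function under test.
purge() {
    # Refuse to act unless both parts of the path are non-empty.
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "purge: refusing to run with an empty argument" >&2
        return 1
    fi
    rm -rf -- "$sandbox/$1/$2"
}

# Set up one directory to delete and one bystander file that must survive.
mkdir -p "$sandbox/user1/retired-files"
touch "$sandbox/bystander.txt"

# Test 1: the normal case succeeds and removes the target.
purge "user1" "retired-files"
ok_status=$?

# Test 2: the failure case must fail, and must leave everything else alone.
purge "" "" 2>/dev/null
fail_status=$?

if [ "$ok_status" -eq 0 ] && [ ! -e "$sandbox/user1/retired-files" ] \
   && [ "$fail_status" -ne 0 ] && [ -e "$sandbox/bystander.txt" ]; then
    result="PASS"
else
    result="FAIL"
fi
echo "$result"

# Clean up the sandbox itself.
rm -rf -- "$sandbox"
```

The second test is the one that would have mattered here: it passes only if the function says “no” when it is handed nothing.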

Security wrappers can help, too, like the safe-rm flavour of the rm command that lets you keep a “defence-in-depth” blocklist of files that should never be deleted, even if you try very hard, in order to protect you from yourself.
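
The idea behind such a wrapper can be sketched in a few lines. This is not how safe-rm itself is implemented or configured; the function name may_delete and the list of protected paths are assumptions made up for illustration, and the sketch only makes the yes-or-no decision without deleting anything.

```shell
#!/bin/sh
# Hypothetical blocklist of paths that must never be removed.
PROTECTED="/ /etc /home /usr /var"

# Return 0 if the path may be deleted, 1 if it is on the blocklist.
# This sketch only makes the decision; it never deletes anything.
may_delete() {
    # Strip trailing slashes so "/etc/" is treated like "/etc".
    target=$(printf '%s' "$1" | sed 's:/*$::')
    # An empty result (from "" or from "/") is treated as the root directory.
    [ -n "$target" ] || target="/"
    for p in $PROTECTED; do
        [ "$target" = "$p" ] && return 1
    done
    return 0
}

may_delete "/";            root_status=$?
may_delete "/etc/";        etc_status=$?
may_delete "";             empty_status=$?
may_delete "/tmp/scratch"; scratch_status=$?

echo "/ -> $root_status, /etc/ -> $etc_status, empty -> $empty_status, /tmp/scratch -> $scratch_status"
```

A real wrapper would also need to deal with relative paths and symbolic links, which this sketch deliberately ignores.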

But the big one here is backup.

If ransomware has one silver lining, it’s the fact that it’s getting backup a bit closer to the front of our minds.

The only backup you will ever regret…

…is the one you didn’t make.

(Encrypt your backups. That way if someone steals your offsite disks, there’s still no data breach.)

Image of sewer outfall pipe courtesy of Shutterstock.