Anatomy of a bug – how Mac OS X chokes if you say “FILE”

You’ve heard of saying “BOO!” to a goose.

This is a matter of saying FILE colon slash slash something to any of a raft of applications on OS X.

You may have read about this bug already, as it’s understandably made geeky headlines amongst Mac techies.

The most common version of the story tells you to open TextEdit (which you can think of as OS X’s NOTEPAD.EXE) and type in the text shown below, just as you see it there.

→ I’ve deliberately circumlocuted the offending text above, just in case you’re reading this on a Mac in an affected application. I don’t want to crash your browser, reader or RSS software by mistake. Elsewhere I’ll write it as FILECSSx or fileCSSx just in case.

Some of the stories specify that you have to type File, exactly like that, with an upper-case F and a lower-case ile. Actually, that’s not true: you just have to capitalise one or more of the letters, and I’ll explain why in a moment.

Shortly after you’ve typed the third slash, TextEdit will crash:

If you choose the Report... option, a system application called Problem Reporter pops up, containing a raft of debugging information you can send to Apple if you choose.

But before you can read what’s gone wrong, Problem Reporter crashes, too! (Problem Reporter is smart enough not to pop back up to offer to report the problem in Problem Reporter.)

Running TextEdit in a debugger quickly gives some insight into the problem:

The trouble happens in a system library called DataDetectorsCore, which lots of applications use to recognise and act upon special content in a text window, such as URLs, telephone numbers and so forth.

Since the text string fileCSSx denotes a local, file-based URL, it’s just the kind of text you’d expect Apple’s data detector code to locate and react to.

Unfortunately, it’s not just typing in the offending text that crashes the application, but having it displayed in the first place.

That’s why Problem Reporter crashes even before you type or click anything, because it prints out the offending text in its message window. (Happily, the Terminal application, where you run the Apple debugger, is not affected by this bug.)

Decompiling the misbehaving library at the problematic function tells us more:

In programmer-speak, this snippet of code is known as an assertion, and coders are encouraged to use this sort of this-absolutely-must-be-true test to make their code resilient.

The idea is to be unrelenting about your assumptions.

If you’re about to process a login, for example, and you’re at a point in your code where a blank password ought already to have been rejected and reported as an error, you might use an assertion to check that your password buffer really isn’t empty.

At the time an assertion is checked, it’s too late to correct the problem you’re worried about; yet it’s unsafe to continue if there is a problem. So a failed assertion causes the program to terminate.

→ An assertion is a bit like a roadworthy test: you can drive to the test, but if your car fails badly enough, you might not be allowed to drive it home, for your own and everyone else’s safety.
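To make the login example concrete, here is a minimal sketch in Python. The function name `check_login` and the hard-coded credentials are invented for illustration; only the shape of the assertion matters.

```python
def check_login(username: str, password: str) -> bool:
    """Hypothetical login step; earlier code should already have rejected blank passwords."""
    # This-absolutely-must-be-true test: if it fails, execution stops here
    # rather than carrying on in a state the code was never designed for.
    assert password != "", "blank password should have been rejected earlier"
    # Stand-in for a real credential check (an assumption for illustration).
    return (username, password) == ("duck", "s3cret")
```

In Python a failed `assert` raises `AssertionError`, which ends the program if nothing catches it. In C-family code, the equivalent typically aborts the process on the spot, which is the behaviour described above.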

The code above deals with a file-based URL, so the programmer has decided to reassure himself that the URL really does start with the characters fileCSS.

He’s done this using the system function CFStringHasPrefix, which does a case-sensitive comparison.

But the data detector that calls this code is more liberal, and recognises [Ff][Ii][Ll][Ee]://, which is computerese for the word “file” spelled out in any mix of case.

In other words, if you type FileCSSx, or fIleCSSx, or any other similar combination, the assertion fails, and, as we’ve seen already, it’s DDCrash city.

Either the recogniser that matches text other than lower-case-only fileCSSx is being too liberal, or the assertion is being too strict.
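The mismatch is easy to model. The sketch below is not Apple's code: it uses a Python regular expression as a stand-in for the liberal recogniser and `str.startswith` as a stand-in for the case-sensitive CFStringHasPrefix test. The function names are my own.

```python
import re

# Build the prefix from pieces so the literal trigger text never appears
# in this listing, in keeping with the caution used elsewhere in this article.
SEP = ":" + "/" * 2
STRICT_PREFIX = "file" + SEP

# The liberal recogniser: "file" in any mix of case, then colon slash slash.
RECOGNISER = re.compile("[Ff][Ii][Ll][Ee]" + SEP)


def recogniser_accepts(text: str) -> bool:
    return RECOGNISER.match(text) is not None


def strict_assertion_holds(text: str) -> bool:
    # Case-sensitive prefix test, standing in for CFStringHasPrefix.
    return text.startswith(STRICT_PREFIX)


def would_crash(text: str) -> bool:
    """True when the recogniser hands over text that the assertion then rejects."""
    return recogniser_accepts(text) and not strict_assertion_holds(text)
```

With these definitions, `would_crash("File" + SEP + "x")` is true, while the all-lower-case spelling passes both tests and so is harmless.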

Of course, this is just the sort of impasse that an assertion is designed to deal with: if in doubt, come to a dead stop at once.

That’s cold comfort to users of Safari (try typing FileCSSx in the address bar), Apple Mail (try using FileCSSx in the subject line), and many more applications that knowingly or unknowingly make use of DataDetectorsCore, which is part of Apple’s AppKit development libraries.

What was intended as a safety check can now be abused for a denial-of-service that is annoying at best.

What to do?

My first reaction came from my hacker’s stomach: get rid of the assertion!

Change the conditional jne (jump if not equal to) instruction in the highlighted code above to an unconditional jmp, so the code behaves as though the assertion always succeeded.

Using a hex editor, I searched in the system library file /System/Library/PrivateFrameworks/DataDetectorsCore.framework/Versions/Current/DataDetectorsCore for the byte string highlighted above, and altered 75 (the opcode for jne) to EB (which is an 8-bit-offset jmp):

What can I say? It works. So far, anyway. But it’s an outrageous hack, and in a production environment, altering official system binaries is strongly discouraged.
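For the curious, the one-byte change can be sketched in Python on an in-memory copy of the bytes. This is an illustration of the technique, not a tool to point at a system library: the function name is mine, and the caller must supply the surrounding byte string, which I have not reproduced here.

```python
JNE_REL8 = 0x75  # conditional short jump: jne with an 8-bit offset
JMP_REL8 = 0xEB  # unconditional short jump with an 8-bit offset


def patch_jne_to_jmp(image: bytes, context: bytes, jne_index: int) -> bytes:
    """Return a copy of image with one jne inside a unique byte string turned into jmp.

    context is the byte string to search for; jne_index is the position of
    the 0x75 opcode within it. Both are supplied by the caller.
    """
    if context[jne_index] != JNE_REL8:
        raise ValueError("context does not have a jne opcode at jne_index")
    start = image.find(context)
    if start == -1:
        raise ValueError("context not found")
    if image.find(context, start + 1) != -1:
        raise ValueError("context is ambiguous; refusing to patch")
    patched = bytearray(image)
    patched[start + jne_index] = JMP_REL8
    return bytes(patched)
```

As a made-up example, patching the context `85 C0 75 1A` at index 2 yields `85 C0 EB 1A`. Both instructions are two bytes long and the offset byte is left alone, so the jump lands in the same place and nothing else in the file moves.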

In particular, I’ve done much more than merely relax the assertion slightly to let close-to-correct strings like FileCSSx and fiLeCSSx through to the underlying code in DDResultCopyExtractedURL.

My hacked library will now tolerate absurdly mismatched input like GoNDwanaLand:$$$ and #$%!!#$%@ as well, which could expose the code to vulnerabilities from which it was supposed to be shielded by the assertion.
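The difference between relaxing the check and removing it can be shown in a few lines. These three Python functions are hypothetical stand-ins, not the library's own logic: one for the original test, one for a gentler case-insensitive version, and one for what my patch effectively leaves behind.

```python
SEP = ":" + "/" * 2  # built from pieces to avoid spelling out the trigger text
PREFIX = "file" + SEP


def strict_check(text: str) -> bool:
    # The original behaviour: case-sensitive prefix comparison.
    return text.startswith(PREFIX)


def relaxed_check(text: str) -> bool:
    # The gentler fix: compare the prefix without regard to case.
    return text[:len(PREFIX)].lower() == PREFIX


def removed_check(text: str) -> bool:
    # What the jne-to-jmp patch amounts to: every input is waved through.
    return True
```

Only `removed_check` accepts `GoNDwanaLand:$$$`; the relaxed version still rejects it while letting the mixed-case spellings through.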

Is there a better way?

Yes, and I’m not the only one to have tried it and found it satisfactory. Fellow Aussie Daniel Tang, who’s a Mac debugger fan even though he’s still at school, wrote this up and reported success.

You can’t safely delete or rename the /System/Library/PrivateFrameworks/DataDetectorsCore.framework/DataDetectorsCore system file, because applications like TextEdit rely on finding it at load time.

If it’s not there, OS X will give you an error:

→ Whatever you do, don’t experiment by renaming DataDetectorsCore unless you have a terminal already open! Terminal itself depends on that shared library (even though Terminal doesn’t seem to be affected by the bug), so you won’t easily be able to get to a root shell to recover from your experiment.

But DataDetectorsCore itself relies on a set of sub-libraries that it loads at run-time.

If they aren’t there, it doesn’t seem to mind. You just don’t get the data detection functionality they would have provided.

On a standard system, they are:

If you rename the urlifier library, say by appending .turned-off to the filename (so you can easily restore it if needed), then automatic URL detection in your documents will no longer work, but nor will the FileCSSx bug.
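The rename-and-restore step can be sketched as a pair of small Python helpers. The function names are mine, and the caller supplies the path: I have deliberately not filled in the real filename here. On a live system you would also need administrator privileges to touch anything under /System.

```python
from pathlib import Path

SUFFIX = ".turned-off"


def turn_off(library: Path) -> Path:
    """Rename library by appending SUFFIX; returns the new path."""
    disabled = library.with_name(library.name + SUFFIX)
    if disabled.exists():
        raise FileExistsError(f"{disabled} already exists")
    library.rename(disabled)
    return disabled


def turn_on(disabled: Path) -> Path:
    """Undo turn_off by stripping SUFFIX from the filename."""
    if not disabled.name.endswith(SUFFIX):
        raise ValueError(f"{disabled} does not end with {SUFFIX}")
    original = disabled.with_name(disabled.name[:-len(SUFFIX)])
    disabled.rename(original)
    return original
```

Appending a suffix rather than deleting the file means that restoring it is a single rename in the other direction.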

Again, this isn’t an official workaround, so you use it at your own risk. But it worked for Daniel and it’s working for me.

If you prefer to keep on more official ground, I suspect that Apple will proffer a fix pretty soon, so make sure to check for updates if you aren’t already doing so automatically.

Hope you found this bug entertaining – enjoy it while it lasts.