Anatomy of a buffer overflow - Google's "KeyStore" security module for Android

Filed Under: Android, Featured, Google, Vulnerability

Here's a cautionary tale about a bug, courtesy of IBM.

Not that IBM had the bug, just to be clear: Google had the bug, and IBM researchers spotted it.

Just like Google's serial SNAFUs in the Android cryptographic verifier last year, this bug has the added interest of irony, because it was found in a security-related part of Android called KeyStore.

Click to read the IBM paper...

Why KeyStore?

If you've ever wondered how Android apps keep things like cryptographic keys secure, then you have wondered wisely.

After all, mobile apps form an ecosystem that has given rise to some egregiously insecure behaviour - behaviour not by you, the user, but by the suppliers of the software themselves.

Snapchat, for example, claimed that your risqué photos would vanish after their first viewing, but didn't even bother to delete the image files from the from the recipient's phone, and that's before you even consider how bogus the company's claim was in the first place.

WhatsApp bravely tried to add a layer of encryption to your messages, but managed to invent a flawed cryptosystem that amounted to a one-time pad used twice. (It's not called a one-time pad for nothing!)

And a survey from January 2014 showed that 40% of online banking apps - software officially promoted by financial institutions for secure mobile banking - didn't bother to check HTTPS certificates and so could be fooled by Man in the Middle (MitM) phishing crooks at any internet-enabled coffee shop.

So, at least as far as the secure storage of cryptographic keys was concerned, Google decided to introduce a standardised way for apps to store credentials, in the hope of adding some order to the mobile cryptographic chaos.

Better yet, the KeyStore implementation, significantly enhanced in Android 4.3, automatically ensures that each app's KeyStore data is strongly encrypted with its own unique key as well as a key derived from your lockscreen passcode, and uses special-purpose hardware for key storage if it is available (as it is on the Nexus 7, for example).

So far, so good.

The KeyStore implementation

When you start looking at the main KeyStore source code file, keystore.cpp, you'll soon notice this comment:

To keep things simple, buffers are always larger than the maximum space we needed, so boundary checks on buffers are omitted.

Your eyes may have popped open there: if boundary checks on buffers have deliberately been omitted, then if the coder has slipped up even once so that a buffer overflow is possible after all, you can be certain that it won't be detected, let alone stopped.

The code is written in C++, but there are plenty of C-style strings in there, including strings allocated on the stack, which in C code is done like this:

Don't worry if you aren't fluent in C; all you need to know is that this code snippet takes a keyName (and the uid, or user ID, that is unique to the app requesting the key) and converts it into a filename by which it will be represented in the KeyStore system.

It does so by performing some sort of manipulation, determined by the encode_key_for_uid() function, that reads data bytes out of the memory buffer called keyName and saves them into the new variable filename, which is defined by the line:

That's C's way of saying, "reserve me a buffer of characters (bytes) on the stack that is NAME_MAX bytes long."

NAME_MAX, as it happens, is set elsewhere in the Android code to 255, the longest filename that's allowed, because there is no point in trying to create a file with a longer name.

Alarm bell ringing

You may hear an alarm bell ringing in your head already.

The keep-calm-and-carry-on comment above assures us that buffers are always larger that the maximum space we need, so that watching out for overflows is an irrelevancy.

So that's one promise as good as broken by the programmer already, albeit only in a pedantic sense, because filename is exactly as big as the maximum length of a filename, not "always larger."

And you'll also notice that the programmer doesn't tell the encode_key_for_uid() function how big the filename buffer is.

In C, strings like filename really are just plain, unadorned and unannotated arrays of bytes; they don't include a record of how many bytes they are allowed to hold.

This is why buffer overflows are easy mistakes to make in C: when you copy num bytes from string src to string dest, string src doesn't "know" whether it actually contains num bytes for you to read from, and string dst doesn't "know" whether it has num bytes of space for you to write to.

You have to keep track of how much memory is allocated to each string yourself.

Is filename big enough?

And that raises the worrying question, "Is filename actually big enough for the data from keyName that is about to be copied into it?"

Our first stop is to dive into encode_key_for_uid() and see what it does:

Again, don't worry if you don't know C; this code writes the numeric user ID of the calling app (uid), as a text string of decimal digits followed by an underscore, into the out parameter, which you will remember from above is our filename buffer of 255 bytes.

Because an Android user ID is stored in 32 bits, it can never be larger than 232-1, which is about 4 × 109, or 10 decimal digits.

So at most 12 bytes (10 digits, an underscore, and a zero-byte, or NUL, used to terminate a C text string) will be written into our filename buffer, which is 255 bytes long.

Note that the programmer tried to do the right thing here, by using the function snprintf(), which is slightly cryptic shorthand for "do a formatted print of at most N bytes into a string buffer."

The coder has deliberately set a cap of 255 bytes on the maximum number of bytes to write, even though he knows he's well within the limit, and despite saying at the start that boundary checks were unnecessary and had been omitted.

This code is poor despite the use of snprintf(). The programmer has stated the length of filename explicitly both in the calling function, where he defined it (i.e. where it actually acquires its 255 bytes of memory), and in the called function. So if ever you should later change the size of the buffer where it is defined, you'll need to dig into every place where the buffer is used and change the size as well. If you forget, you will introduce a bug. The coder should have passed the value sizeof filename into the called function instead.

Next, we dive further, entering the encode_key() function, which copies more bytes into filename.

This time, it doesn't bother to check whether there is enough space:

This function shovels bytes from the in string, which is our keyName, into the out string, our filename, until there are no more to be copied.

Regular printable characters (1) are copied unchanged, but (2) if the input byte would be an illegal or unprintable filename character, the algorithm (3) encodes it into two characters that are safe for use in a filename. This is already a worrying detail considering that the code assumes all buffers are big enough. Our 255-byte filename buffer could, in theory, be overflowed by a maliciously chosen keyName of just 128 bytes, if each input byte were deliberately chosen so that it encoded into two output bytes. Finally, a (4) NUL character is added to terminate the string - an extra byte for which the programmer forgot to reserve space at numerous places in the code. (See below.)

What if keyName is too long?

As you can imagine, at this point the IBM researchers wondered, "What happens if keyName is too long to fit?"

The programmer promised us that all buffers would be big enough, but a search of the code shows that this is untrue.

The filename buffer is hard-wired at 255 characters, while the keyName parameter is passed in from an app and can be any size (and contain any byte sequence) the calling app wants.

So we have a buffer overflow on the stack that can be precisely controlled by a malicious app.

There's no guarantee that this vulnerability could ever be exploited for more than a denial of service attack (crashing the KeyStore system), but it's a vulnerability neverthless.

Google fixed it in Android 4.4, but not very thoroughly: the code still relies on the assumption that "buffers are always larger than the maximum space we needed, so boundary checks on buffers are omitted."

Interestingly, Google also spotted many other places where the coder had forgotten about the NUL byte that terminates C strings and incorrectly allocated buffers that were exactly one byte too small for their output, apparently without causing any errors to show up in testing.

These flaws were fixed with liberal use of + 1, like this:

Advice for C programmers

Try these tips:

  • Avoid allocating string buffers on the stack. Unlike heap overflows, it's much easier for a hacker to predict what other variables might be overwritten if an overflow should happen.
  • Always keep track of string sizes, and check that there is space before writing to a buffer. Even you feel certain that your checks are redundant, put them in. If nothing else, it makes your code safer against future changes, and easier for others to understand.
  • Never tell yourself that "buffers are always larger than the maximum space we need," especially if you allow them to be filled from input buffers of unlimited size passed in by an app that uses your code.
  • Don't let code through review or QA if it makes claims like "boundary checks on buffers are omitted as unnecessary." They probably are necessary, and even if they aren't, they are always desirable.

In short: write your code under the assumption that it will be reviewed by another programmer who is meticulous, severe and pedantic; and that it will be called by a crook who is devious, motivated and persistent.

By all means expect the best, but never forget to prepare for the worst.

, , , , , ,

You might like

4 Responses to Anatomy of a buffer overflow - Google's "KeyStore" security module for Android

  1. Comments: Why is this string being allocated on the stack? Why is the length of the string not parametrized in the subsequent function calls? Strings in C are not objects, their length always has to be passed with them. Jesus christ, what is this, 1996?

  2. Bob · 115 days ago

    I don't know whether to laugh or cry - probably cry.

    • Paul Ducklin · 115 days ago

      Maybe you could just do one of those smiles that segues into a grimace?

  3. VL-S · 114 days ago

    Just a question about revisions. By 4.4 do you mean 4.4.n. I ask this because my Nexus just updated from 4.4.3 to 4.4.4.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

About the author

Paul Ducklin is a passionate security proselytiser. (That's like an evangelist, but more so!) He lives and breathes computer security, and would be happy for you to do so, too. Paul won the inaugural AusCERT Director's Award for Individual Excellence in Computer Security in 2009. Follow him on Twitter: @duckblog