Anatomy of a bug - misplaced parenthesis threatens NetBSD's random numbers

Filed Under: Cryptography, Featured

Oh, frailty, thy name (with apologies to [William {The Bard}] Shakespeare) is parenthesis.

What a difference a misplaced bracket makes!

As our friends at The Register reported last week, the NetBSD coders recently patched a programming bug in their kernel that affected the sanctity of the operating system's random numbers.

And, as we have explored before on Naked Security, good quality random numbers are a vital aspect of modern computing.

In particular, cryptography requires random numbers that are not only random (meaning that there is no bias towards ones or zeros in the bits that appear), but also unpredictable (meaning that you can't guess what comes next even if you have an extensive collection of previous output).

Modern Unix and Unix-like operating system kernels typically provide two commonly-used sources of randomness, named /dev/random and /dev/urandom.

Both mix in input that isn't entirely dependent on software, such as mouse movements, disk latency measurements, network traffic, keyboard activity, and more.

→ Mouse and keyboard movements are not a terribly good source of entropy, as "lack of order and predictability" is called in the field of random numbers. Indeed, user interaction never happens on headless servers. But mixing in at least some data extracted from the real-time behaviour of the underlying hardware, especially measurements that are affected by external factors such as temperature or load imposed by other devices on the network, helps to reduce the predictability of algorithmically generated output.

In practice, the difference between urandom and random is that the former continues unabated even if the external entropy feed runs dry, falling back on purely software-based output, while the latter stream of data may block, meaning that a program reading it will freeze until the entropy pool acquires some new input.

You can demonstrate this on a Unix-like system by running the command od -Ax -t x1 /dev/random and letting it run:

Every now and then, you should see the output pause for a while as the system's hardware-derived entropy pool dries up; wait a while (or wiggle the mouse to provide some external input to the pool) and the flow of random bytes should resume.

So, part of the promise of /dev/random is that it tries really hard to be random.

In NetBSD, which uses an AES-128-based random number generator keyed independently for each process that uses /dev/random, fulfilling that "promise" was done with code in C similar to this:

Turned into pseudocode, this is supposed to achieve the following result (the variable key is an AES-128 cipher key, 128 bits or 16 bytes in length):

read up to 16 bytes of high-quality random data
if we got fewer that 16 bytes then:
    produce a warning about entropy
    top up the 16 bytes with lower-quality data

Even if there is a shortage of high-quality entropy data, your random number stream will nevertheless consist of AES output keyed by a full 16 bytes of at-worst-pseudorandom data that is unique to your use of /dev/random.

Except that the C code above doesn't actually do what was intended. It was supposed to be:

Notice the subtle-yet-critical difference between sizeof(key-r) and sizeof(key)-r.

Due to the pecadilloes of C, sizeof(key) works out to be the size of the memory buffer key, which is defined in the NetBSD code similarly to this:

unsigned char key[16];  /* 128-bit AES */

Since r is the number of high-entropy bytes read in so far, sizeof(key)-r, which is what the programmer meant to write, works out to be the number of bytes by which r fell short of 16.

So, topping up the original r bytes of random data with a further sizeof(key)-r bytes of pseudorandom data ensures that a full 16 bytes of random data (whether of GOOD or ANY quality) are always used to seed the random generator.

This is so because, perhaps rather obviously, r plus sizeof(key)-r is equal to sizeof(key), i.e. 16.

But the programmer wrote sizeof(key-r) by mistake.

That doesn't really make sense, at least to a human, since sizeof() in C usually produces a constant determined at compile time, while r is the variable amount of data read in at runtime.

Sadly for NetBSD, however, a fixed memory address (in this case, the address of key) plus or minus a variable integer (in this case, r) is considered by the C compiler to be an address with an offset: in other words, just another memory address.

And so sizeof(key-r), computed at compile time, is equivalent to sizeof(the space needed to store any memory address), i.e. the size of a pointer variable.

That is not the same thing as sizeof(memory actually dedicated to the array variable key).

Indeed, on a 64-bit system, sizeof(key-r) is 8; on a 32-bit system, with 32-bit memory addresses, it's only 4.

So, in the worst case, the erroneous code might perform as follows:

read up to 16 bytes of high-quality random data
oh dear, no high-quality data at all, so:
    produce a warning about entropy
    rely on just 4 bytes of lower-quality data

The bug was easily fixed, although it has been fixed again since The Register's article last week, following a decision that the original fix wasn't entirely satisfactory:

The good news is that in fixing the fix, the coder reviewing the original error came to the conclusion that even an ineptly-keyed random stream would probably not be predictable.

That's because the random generator continues mixing in additional input between the initial "keying" stage and the point at which the user starts getting data from it, adding entropy over and above the minimum four starting bytes.

Nevertheless, there are two lessons here that every C programmer needs to remember:

  • Watch those brackets.
  • The sizeof() operator isn't a function.

, , , , ,

You might like

23 Responses to Anatomy of a bug - misplaced parenthesis threatens NetBSD's random numbers

  1. Mark · 506 days ago

    Is it me but are the two screenshots above the same.

    Shouldn't one be sizeof(key-r)?

  2. Brad · 506 days ago

    Typo in the current png files?

    They both have the exact same content, even down to the MD5 signature. :(

  3. Chris · 506 days ago

    Thanks for the write-up. I hate to be pedantic, but your two PNG snippets of the C are identical: one should say sizeof(key-r) (bad) and the other sizeof(key)-r (good).

    C.

  4. Ben · 506 days ago

    Great and interesting post. Just wanted to point out that the example C code is identical (sizeof(key) - r) in both instances which makes it hard to follow.

  5. Rafael Jacob Werlang · 506 days ago

    Looks like there was also a memory corruption vulnerability there.
    If the first rnd_extract_data function call returned 13, 14 or 15 bytes of data on a 32bit processor, or even 9 to 15 bytes on a 64bit processor, seems like data right after the memory allocated for the key would get corrupted. I couldn't take a look at the whole function yet but if it is the case, which it clearly seems to be, its more alarming than just a flaw with the pseudo-randomness there.

  6. Tom · 506 days ago

    Your screenshots are the same, there's no difference...

  7. David Rinehart · 506 days ago

    Interesting article. It's hard enough making sure the parentheses are closed, although at least the compiler helps with that; to get them closed in the right place is where art and science meet...

    Two notes:

    - in first paragraph, The Bard's name is Shakespeare, not Shakepseare
    - the two graphics of the code appear to be identical; at least, the "sizeof(key)-r" is the same in both graphics.

  8. Polonius · 506 days ago

    Unfortunately, your first two images of C code, which are intended to show the difference between the actual and intended code, appear to be identical. Then again, the whole story's probably an April Fool!

    • Paul Ducklin · 506 days ago

      Ow.

      (Nice excuse! Sadly, it was not an April Fool, but merely a case of the same file uploaded twice under different names :-)

  9. Guest Viewer · 506 days ago

    Unless I am missing something, your examples, in the two blue boxes, are 100% identical. The critical piece - sizeof(key)-r - is the same in both examples, leading the reader to... confusion.

  10. Ash · 506 days ago

    The "actual" and "supposed to be" images are identical. The text explaining the issue is right but the pictures don't match.

  11. John Baxter · 506 days ago

    Please recheck the first code snippet. (I think the ...sizeof(key-r) got fixed in producing the post, or the wrong image was picked up.) Then feel free to drop this comment.

    • Paul Ducklin · 506 days ago

      Ow.

      (I felt obliged to keep your comment since all the others made it through :-)

  12. Robert · 506 days ago

    I think you accidentally used the "correct" code for both images? both show "sizeof(key)-r"

  13. in your example you properly placed the parentheses on both the before and after

  14. Random reader · 506 days ago

    You can't notice the subtle-yet-critical difference between the blue boxes because they are exactly the same.

    "That doesn't really to make sense..." doesn't make sense. How about a little proofreading to make the subtle-yet-critical difference between meaningful commentary and utter nonsense.

    • Paul Ducklin · 506 days ago

      Thanks for your correction.

      But try reading the (very many) previous comments to understand the subtle-yet-critical difference between making your point and overstating your case to the point of egregious rudeness :-)

      (I suggest that "doesn't really to make sense" is actually comprehensible to most readers of English, since natural language often provides enough semantic redundancy to allow decent error correction. And I think you meant "how about *sufficient* proofreading", since "a little proofreading" is what I did and it clearly wasn't enough.)

  15. Rafael Jacob Werlang · 505 days ago

    I am glad I am the only one who did not say anything about the misplaced picture.
    And then I am glad again I was the only one who spotted the memory corruption vulnerability, which should be stated as the worst case scenario, not low entropy on prng.

  16. FR · 505 days ago

    The other irony is that sizeof is not an function, it's an operator. So you don't even need the brackets.

    Not that I'm a language lawyer, but a bit of parsimony sometimes goes a long way..

    Bernard Shaw said it right such a long time ago.

    :-)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

About the author

Paul Ducklin is a passionate security proselytiser. (That's like an evangelist, but more so!) He lives and breathes computer security, and would be happy for you to do so, too. Paul won the inaugural AusCERT Director's Award for Individual Excellence in Computer Security in 2009. Follow him on Twitter: @duckblog