Anatomy of a bug – misplaced parenthesis threatens NetBSD’s random numbers

Oh, frailty, thy name (with apologies to [William {The Bard}] Shakespeare) is parenthesis.

What a difference a misplaced bracket makes!

As our friends at The Register reported last week, the NetBSD coders recently patched a programming bug in their kernel that affected the sanctity of the operating system’s random numbers.

And, as we have explored before on Naked Security, good quality random numbers are a vital aspect of modern computing.

In particular, cryptography requires random numbers that are not only random (meaning that there is no bias towards ones or zeros in the bits that appear), but also unpredictable (meaning that you can’t guess what comes next even if you have an extensive collection of previous output).

Modern Unix and Unix-like operating system kernels typically provide two commonly-used sources of randomness, named /dev/random and /dev/urandom.

Both mix in input that isn’t entirely dependent on software, such as mouse movements, disk latency measurements, network traffic, keyboard activity, and more.

→ Mouse and keyboard movements are not a terribly good source of entropy, as “lack of order and predictability” is known in the field of random numbers, not least because user interaction simply doesn’t happen on headless servers. But mixing in at least some data extracted from the real-time behaviour of the underlying hardware, especially measurements that are affected by external factors such as temperature or load imposed by other devices on the network, helps to reduce the predictability of algorithmically generated output.

In practice, the difference between urandom and random is that the former continues unabated even if the external entropy feed runs dry, falling back on purely software-based output, while the latter stream of data may block, meaning that a program reading it will freeze until the entropy pool acquires some new input.
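
For a sense of what that looks like from a program’s point of view, here is a minimal C sketch (error handling kept to a bare minimum, and purely illustrative rather than anything from NetBSD itself) that pulls up to 16 bytes from /dev/random; the read() call may simply sit there until the kernel’s entropy pool has something to offer:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    unsigned char buf[16];
    int fd = open("/dev/random", O_RDONLY);   /* the "blocking" device */

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* This read may pause until the entropy pool has data to hand out */
    ssize_t got = read(fd, buf, sizeof(buf));

    for (ssize_t i = 0; i < got; i++)
        printf("%02x ", buf[i]);
    printf("\n");

    close(fd);
    return 0;
}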

You can demonstrate this on a Unix-like system by running the command od -Ax -t x1 /dev/random and letting it run.
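
The output will look something like this (the bytes below are purely illustrative; your own /dev/random will, of course, serve up different data every time):

000000 9f 3a 5c e1 07 b4 68 d2 2d 91 fa 40 c6 7b 15 ae
000010 63 d8 0c 4f b2 29 e7 94 51 8a 3e f5 1b c0 76 0d
000020 e4 37 a9 52 8e 6b f1 04 cd 70 b9 26 5f 93 d4 48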

Every now and then, you should see the output pause for a while as the system’s hardware-derived entropy pool dries up; wait a while (or wiggle the mouse to provide some external input to the pool) and the flow of random bytes should resume.

So, part of the promise of /dev/random is that it tries really hard to be random.

In NetBSD, which uses an AES-128-based random number generator keyed independently for each process that uses /dev/random, fulfilling that “promise” was done with C code along these lines (a simplified reconstruction rather than the verbatim kernel source: the rnd_extract_data() helper and the GOOD/ANY quality flags reflect NetBSD’s naming, but the surrounding detail has been trimmed):
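
r = rnd_extract_data(key, sizeof(key), RND_EXTRACT_GOOD);

if (r != sizeof(key)) {
    /* produce a warning about the entropy shortfall (omitted here),
       then top up the key with lower-quality data */
    rnd_extract_data(key + r, sizeof(key - r), RND_EXTRACT_ANY);
}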

Turned into pseudocode, this is supposed to achieve the following result (the variable key is an AES-128 cipher key, 128 bits or 16 bytes in length):

read up to 16 bytes of high-quality random data
if we got fewer than 16 bytes then:
    produce a warning about entropy
    top up the 16 bytes with lower-quality data

Even if there is a shortage of high-quality entropy data, your random number stream will nevertheless consist of AES output keyed by a full 16 bytes of at-worst-pseudorandom data that is unique to your use of /dev/random.

Except that the C code above doesn’t actually do what was intended. It was supposed to be:
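
    rnd_extract_data(key + r, sizeof(key) - r, RND_EXTRACT_ANY);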

Notice the subtle-yet-critical difference between sizeof(key-r) and sizeof(key)-r.

Due to the peccadilloes of C, sizeof(key) works out to be the size of the memory buffer key, which is defined in the NetBSD code something like this:

unsigned char key[16];  /* 128-bit AES */

Since r is the number of high-entropy bytes read in so far, sizeof(key)-r, which is what the programmer meant to write, works out to be the number of bytes by which r fell short of 16.

So, topping up the original r bytes of random data with a further sizeof(key)-r bytes of pseudorandom data ensures that a full 16 bytes of random data (whether of GOOD or ANY quality) are always used to seed the random generator.

This is so because, perhaps rather obviously, r plus sizeof(key)-r is equal to sizeof(key), i.e. 16.

But the programmer wrote sizeof(key-r) by mistake.

That doesn’t really make sense, at least to a human, since sizeof() in C usually produces a constant determined at compile time, while r is the variable amount of data read in at runtime.

Sadly for NetBSD, however, a fixed memory address (in this case, the address of key) plus or minus a variable integer (in this case, r) is considered by the C compiler to be an address with an offset: in other words, just another memory address.

And so sizeof(key-r), computed at compile time, is equivalent to sizeof(the space needed to store any memory address), i.e. the size of a pointer variable.

That is not the same thing as sizeof(memory actually dedicated to the array variable key).

Indeed, on a 64-bit system, sizeof(key-r) is 8; on a 32-bit system, with 32-bit memory addresses, it’s only 4.
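
A tiny standalone test program, nothing to do with the NetBSD source itself, makes the difference plain:

#include <stdio.h>

int main(void)
{
    unsigned char key[16];   /* same shape as the kernel's AES key buffer */
    size_t r = 3;            /* pretend only 3 bytes of good entropy arrived */

    printf("sizeof(key)     = %zu\n", sizeof(key));      /* 16: the whole buffer */
    printf("sizeof(key - r) = %zu\n", sizeof(key - r));  /* 8 or 4: just a pointer */
    printf("sizeof(key) - r = %zu\n", sizeof(key) - r);  /* 13: the intended shortfall */

    return 0;
}

On a typical 64-bit build that prints 16, 8 and 13; on a 32-bit build the middle number drops to 4.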

So, in the worst case, the erroneous code might perform as follows:

read up to 16 bytes of high-quality random data
oh dear, no high-quality data at all, so:
    produce a warning about entropy
    rely on just 4 bytes of lower-quality data

The bug was easily fixed, although it has been fixed again since The Register’s article last week, following a decision that the original fix wasn’t entirely satisfactory.

The good news is that in fixing the fix, the coder reviewing the original error came to the conclusion that even an ineptly-keyed random stream would probably not be predictable.

That’s because the random generator continues mixing in additional input between the initial “keying” stage and the point at which the user starts getting data from it, adding entropy over and above the minimum four starting bytes.

Nevertheless, there are two lessons here that every C programmer needs to remember:

  • Watch those brackets.
  • The sizeof() operator isn’t a function (see the fragment below).
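
As a closing illustration (again, nothing to do with the NetBSD source), note that because sizeof is an operator, the parentheses are optional when its operand is an expression and only compulsory around a type name:

#include <stdio.h>

int main(void)
{
    unsigned char key[16];

    printf("%zu\n", sizeof key);               /* 16: no parentheses needed for an expression */
    printf("%zu\n", sizeof(unsigned char *));  /* parentheses required around a type name */

    return 0;
}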