Linus Torvalds is a very clever man – he invented Linux, after all – but he seems to struggle with simple human decency.
(He recently expressed the wish that the designers of some hardware he doesn’t like might “die in some incredibly painful accident”, and invited you to puncture the brake lines on their car as a way to make it so.)
So it’s hardly surprising that when he heard a cryptographic suggestion he thought was silly, he let rip like this:
Where do I start a petition to raise the IQ and kernel knowledge of people? Guys, go read drivers/char/random.c. Then, learn about cryptography. Finally, come back here and admit to the world that you were wrong. Short answer: we actually know what we are doing. You don't. Long answer: we use RDRAND as _one_ of many inputs into the random pool, and we use it as a way to _improve_ that random pool. So even if RDRAND were to be back-doored by the NSA, our use of RDRAND actually improves the quality of the random numbers you get from /dev/random. Really short answer: you're ignorant.
That’s right – yet more “NSA cracked my crypto” conspiracy, and this time, the rudest man in Linuxdom is in the thick of it!
Interestingly, there are some useful lessons to be learned here – and they’re more about how to deal with technical issues well than they are about surveillance or digital snooping.
So, at the risk of receiving a Royal Rant from Torvalds himself (me for writing this, and you for reading it), let me explain.
Linux has a special file called /dev/random that doesn’t exist as a real file.
If you open it in a program, and read from it, you get a stream of pseudorandom numbers, generated right inside the kernel.
The idea of doing the work in the kernel is to end up with randomness of a very high quality.
That means minimal bias (each new bit is zero with a probability of 50%), minimal predictability (even if you have a detailed history of recent outputs), and minimal repeatability (you can’t trick the system into giving the same sequence twice).
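To make "minimal bias" concrete, here is a rough sketch of the simplest possible check: count what fraction of the bits in a sample are 1. The function name fraction_of_one_bits is invented for this illustration, and the sample comes from Python's os.urandom(), which asks the operating system for random bytes rather than opening /dev/random directly.

```python
import os


def fraction_of_one_bits(data: bytes) -> float:
    """Return the proportion of bits in `data` that are set to 1."""
    if not data:
        raise ValueError("need at least one byte to measure")
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


# Ask the operating system for 4096 random bytes (32768 bits).
sample = os.urandom(4096)
print(f"fraction of 1 bits: {fraction_of_one_bits(sample):.4f}")
```

A result close to 0.5 is necessary but nowhere near sufficient: a stream of alternating 0s and 1s would pass this check while being perfectly predictable.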
The way this works, very loosely speaking, is that the kernel continually sucks in pseudorandom data from various hard-to-predict sources – how much did the mouse move last time? how quickly did you type? how much time elapsed between two hardware events? – and stirs it all together into a bucket of digital slurry.
Along the way, the pseudorandom inputs are each shovelled through a non-cryptographic hash function to hasten the slurrification.
An estimate is kept – expressed in bits – of the amount of randomness that has been mixed in so far.
When you need random numbers, some of the slurry is fed into a cryptographic hash function in order to extract a pseudorandom bitstream from it.
The amount of randomness extracted is never allowed to exceed the amount currently swilling around in the sludge-bucket: if necessary, your reads from /dev/random are slowed down until the bucket fills up again.
This stops an attacker diluting the sludginess of /dev/random simply by reading wastefully from it until the metaphorical water runs clear.
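The sludge-bucket idea can be sketched as a toy model. This is emphatically not the algorithm in random.c: the class ToySludgeBucket, its methods and the credited_bits parameter are all made up for illustration, with CRC32 standing in for the non-cryptographic stirring and SHA-256 standing in for the cryptographic extraction step.

```python
import hashlib
import zlib


class ToySludgeBucket:
    """Toy model of an entropy pool; NOT the algorithm used in random.c."""

    def __init__(self) -> None:
        self.pool = bytes(64)   # the "slurry"
        self.entropy_bits = 0   # running estimate of randomness mixed in
        self.counter = 0        # makes each extraction hash a different input

    def mix_in(self, sample: bytes, credited_bits: int) -> None:
        """Stir a sample into the pool with a non-cryptographic checksum."""
        crc = zlib.crc32(self.pool + sample).to_bytes(4, "big")
        # Rotate the pool by four bytes, then XOR the checksum into the front.
        rotated = self.pool[4:] + self.pool[:4]
        stirred = bytes(a ^ b for a, b in zip(rotated[:4], crc))
        self.pool = stirred + rotated[4:]
        # Never claim more entropy than the pool can physically hold.
        self.entropy_bits = min(self.entropy_bits + credited_bits,
                                8 * len(self.pool))

    def extract(self, nbytes: int) -> bytes:
        """Hash the pool to produce output, refusing to exceed the estimate."""
        if 8 * nbytes > self.entropy_bits:
            # A real blocking read would wait here until the bucket refilled.
            raise BlockingIOError("not enough estimated entropy in the bucket")
        out = b""
        while len(out) < nbytes:
            self.counter += 1
            block = self.pool + self.counter.to_bytes(8, "big")
            out += hashlib.sha256(block).digest()
        self.entropy_bits -= 8 * nbytes
        return out[:nbytes]


bucket = ToySludgeBucket()
bucket.mix_in(b"mouse moved 17 px", credited_bits=8)
bucket.mix_in(b"keypress gap 143 ms", credited_bits=8)
print(bucket.extract(2).hex())  # allowed: 16 bits requested, 16 bits credited
```

Asking this bucket for even one more byte now raises BlockingIOError, which is the toy equivalent of a read from /dev/random being slowed down until more randomness arrives.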
The question, in the light of recent implications that the NSA has tainted the cryptographic sanctity of everything it could get its tentacles into or onto, is whether it is acceptable for the Linux kernel to use random numbers generated by the CPU itself as part of its official pseudorandom stream.
After all, modern Intel CPUs have an instruction called RDRAND which is supposed to use thermal noise, generally considered an unpredictable byproduct of the fabric of physics itself, to generate high quality random numbers very swiftly.
Sounds like just the ticket, but what if Intel tainted RDRAND, by order of the NSA?
Linus’s school of thought, which is entirely understandable, is that mixing a tainted data stream with a pseudorandom one can’t reduce randomness: even if you stir your bucket of sludge in a really careful and ordered fashion, you still end up with sludge.
(I’m not sure I agree with Linus that mixing in a known-tainted RDRAND stream would nevertheless invariably improve randomness, but on the surface, it shouldn’t reduce it.)
Of course, you can counter that claim – and some concerned digerati have done just that – by postulating an actively hostile RDRAND instruction.
This RDRAND might monitor the state of the rest of the CPU in order to produce “random” data that is specially matched to the existing contents of the sludge-bucket so as to cancel out some of its randomness.
But how likely is that, given the cryptographic and non-cryptographic hash-churning that goes on inside the Linux kernel to stir in new pseudorandom input?
Can you “cancel out” randomness under those circumstances?
How much of the state of the CPU, and even the computer as a whole, would the tainted RDRAND instruction have to track in order to produce a real time active cancellation stream that could predictably tweak the overall output of /dev/random?
Well, here’s the thing.
If you take Linus’s advice, and go read drivers/char/random.c, as an interested spectator called Taylor Hornby did, you won’t find quite the clarity that the rant-master seems to suggest.
For example, the core function get_random_bytes() says that it “does not use the hw random number generator” (which would handily render this whole discussion moot), yet calls a function which does just that:
Furthermore, the hardware-generated random data (that the algorithm isn’t supposed to be using at all, remember?) is consumed after both the non-cryptographic and the cryptographic hash-churning described above.
The RDRAND data is merely XORed into the already-hashed output of the random number generator as the last step of the process.
In theory, then, a hostile RDRAND instruction wouldn’t need to keep track of much CPU state at all, since you can cancel out an XOR merely by repeating it. (X XOR X = 0; X XOR 0 = X; and so Y XOR X XOR X = Y.)
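The XOR arithmetic is easy to try for yourself. The sketch below is purely hypothetical: hostile_rdrand is an invented function that is simply handed the value it is about to be XORed with, which glosses over the hard part (how a real CPU would get hold of that value), and SHA-256 merely stands in for whatever hashing the kernel really performs.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings together."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def hostile_rdrand(value_about_to_be_xored: bytes) -> bytes:
    """Hypothetical tainted generator: echo back the value it will meet."""
    return value_about_to_be_xored


# Stand-in for the already-hashed output of the random number generator.
hashed_output = hashlib.sha256(b"bucket of digital slurry").digest()

# Late mixing: XOR the hardware value in as the very last step.
final = xor_bytes(hashed_output, hostile_rdrand(hashed_output))
print(final.hex())  # 64 zero digits: the randomness has been cancelled out

# Early mixing, for contrast: the same hostile value goes in before the hash.
slurry = b"bucket of digital slurry"
early_mixed = hashlib.sha256(slurry + hostile_rdrand(hashed_output)).digest()
print(early_mixed.hex())
```

In the late-mixing case the attacker only needs to repeat the XOR. In the early-mixing case the hostile value disappears into the hash input, so steering the output would mean working backwards through the hash function, which is what a cryptographic hash is designed to make impractical.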
As Taylor Hornby notes, in a mock dialogue amusingly modelled on Galileo’s Dialogue Concerning the Two Chief World Systems:
Ironically, the random.c source code suggests that a tainted source of randomess is a problem – even at the stage when the bucket of sludge is still being filled, let alone after it has been drained – when it says:
So, if I were King, what would I do to sort this out?
- I’d order my subjects to stop worrying about a tainted RDRAND, at least for now, and concentrate on all the other problems in my Kingdom, such as IE 6, browser Java, unencrypted USB keys, XP’s forthcoming funeral, and sources of randomness that really are broken.
- I’d have some King’s Messengers fix the comments in random.c so that they matched the code, like good documentation should, and actually helped prove the Rude Man’s assertion that “we know what we are doing.”
- I’d fold in the data from RDRAND earlier in the process, along with all the other sources of entropy, so that no-one would need to answer the question, “Who asked you to leave RDRAND until an XOR right at the very end?”
- I’d sentence Mr Torvalds to 200 hours of community service in a hospital orthopaedic ward, helping those who can’t help themselves because of serious injuries sustained in automobile accidents.
Right now, we could do with a bit more clarity in cryptography.
Sadly, ranting that people should read a bunch of historically inconsistent comments in a source code file in order to conquer their ignorance is not a means to that end.
"This RDRAND migh[t] monitor the …"
Fixed, thanks!
Duck, I think this is one of your best. Really well done.
Aw, gee, thanks, Wally.
(So THAT'S where you are!)
I suppose it ended as a little bit of a rant on my part, but Linus is influential and respected. The bloke is 43, not 13, so it's time for him to become more respectful, and to realise that you don't conquer poor learning with poor teaching.
Oh come on. Why are all these (usually US/UK) people getting recently soooo upset about what Linus says and how he says it? Is it a smudging campaign or something?
Guess what – kernel development is a meritocracy. If you know what you are doing, create a patch and submit it. If you don't have the know-how to do it, then learn or shut up and stop making stupid petitions instead of working code.
Linus is not being "abusive" when he just doesn't want to hear uninformed whining and excuses and crap designs or to spend time explaining the same thing over and over to a hundred different persons.
PS. If the RdRand is something you dislike, start the kernel with "nordrand" parameter!!
That PS is actually very handy! I think I'd still fold in the RDRAND randomness earlier.
(I assume it is done at the end because it's a non-blocking source and its randomness is presumed to be very good. But it would be nice to know that's why the coder did it.)
What a horrible, sensational article.
Was it the concept of "digital sludge" that put you off?
I think it would be important to get a response to this from the kernel devs involved.
Mixing in hardware source that late does smell a bit fishy in light of what's going on, and it's not a big ask to have an explanation of some of the design decisions involved.
Even if it isn't important, it would be a heck of a lot more useful to me than implying I'm ignorant because I can't get my head round the contradictions in the code comments 🙂
While I would never participate (or hope to) in Torvalds’s level of rudeness, he has justified it and I must say – I do get where he is coming from: http://marc.info/?l=linux-kernel&m=1373912237… – if you are nice to people doing bad things then bad things keep happening, basically. I see that exact thing every single day because some people really do need to be told to “f*** off and do it over”, but run-of-the-mill business ethics prevent them from being told that.
He has not justified it, merely stated why he does it – because he despises being nice.
And when he behaves like this, especially from his position of influence, he insults all of us.
Like I said above, the bloke is 43, not 13.
Sometimes, even when things are well set against you, yet you treat people doing bad things with respect, you astound them and influence them quite profoundly. N. Mandela, QED.
The petition author wasn't making a "suggestion"
It was a nasty bit of FUD, complete with header graphic insinuating that the kernel devs were in cahoots with the NSA.
In terms of rudeness, the response was proportional to the accusation.
How do you get a change into the kernel?
1a) Make a petition demanding (not suggesting) that something be done
1b) Provide no evidence, no solution. Don't even identify a problem, just spill verbal bile all over the place.
1c) Use a mocked-up image of tux next to the CIA HQ, with unsubstantiated subtitle text to drive home your agenda
OR
2a) Clone the public git tree
2b) Make changes to the kernel source, complete with comments and justification
2c) Subscribe to the LKML, and post your patch
What a pity that Linus chose to imply that the rest of us could learn why he was right from a document that shows no such thing. The imprecation to "read random.c and learn how it works" was a general suggestion…and that file is self-contradictory.
So why *does* the code go on at some length about how a non-cryptographic CRC is good enough *if the entropy data is non-malicious*, then state that the hardware random generator is not used, and finally mix in said hardware randomness after all the bit-mixing?
Two wrongs don't make a right, especially not when it comes to improving education and achieving clarity – I'll state that as an axiom.
Read the bloody code and not the comments mate. If you had done any development in a large long-lived project you'd know comments tend to rot over time…
Not in code written by people who know what they are doing 🙂
(And though I don't see a conspiracy theory in this, I still don't know why the comments, and Linus's rant, talk pretty unambiguously about mixing in RDRAND together with lots of entropy sources, when in fact RDRAND's output is mixed in separately. Since you seem to be an expert in "comment rot," perhaps you can essay an explanation?)
I read something about this the other day and didn’t understand any of it. Now, thanks to you, I understand it a lot better. That was an incredibly clear explanation of a very complicated (at least for non-kernel people like myself) topic. Thank you very much.
You are most welcome. (And thanks for the kind words.)
When you say “invented” you mean “re-implemented”.
He didn’t “invent” anything, that was Ritchie, Thompson, Pike, Mountjoy, Presotto, Duff and other actual inventors.
Err…[looks around sheepishly]…I was being ironic. (From what I have read, the name Linux was actually invented by one of Linus's chums. He was planning to call it Freax, or something.)
…or had they been in Britain; Colinux
Don't you think your change.org thing was awkward?
Linus Torvalds is only saying out loud about people, what information security professionals are thinking about people; like all the time…
Guilty.
So, did you execute your proposal No. 2 & 3? Or is there anybody executing it?
I didn't. I'm not King yet.
The slaves can propose changes (i. e. full changeset), too, with decency and a "hail the chief" ;-). If it serves a greater purpose (security, code clarity), one may need to be deferent and sidestep one's pride.
See this comment, 2.a/b/c): http://nakedsecurity.sophos.com/2013/09/11/rudest…
Why would I propose changes? Linus knows what he's doing. He said so.
Well, I'm sorry, but that's childish to say. "He was mean, so I don't do shit!" If you truly believe there's something wrong, you should fix it, no matter what! Keep the arguments simple and technical. Don't indicate that your reasoning is based on conspiracy theories (or anything like that) if that's the case; only make changes based on technical judgement (like in this blog post) – think and reason like a machine would. Don't make big changes, propose just enough changes so that your intentions are clear. Also, base your changes on what Linus said to be the case already: Point (without blaming anyone) and sort out the inconsistencies between comments and code, move the RDRAND higher up the logic (with the reasoning you gave in this blog post – and say that it even may not increase security, but it decreases the likelihood of an issue to a small fraction). Don't use emotions, even if someone else does; ignore them, that's what grownups are supposed to do even if some violate the rules: Be a better man.
Interesting, but I still don't agree. The fact that the XOR operation can be easily voided is true, but this is not changing the point. Putting it as the last source of XOR-red numbers means that, after having a good random, you add something which COULD be voided after XOR-ing, (true), returning back the initial good random.
So, I agree on your premises, like the XOR statement or the bad documentation, while I can't see how this last source of XOR-ed numbers can impact: in the worst case, to add this source is simply giving you no advantage: who could predict it could erase the last XOR in line, but this is not breaking the quality of the previous sources.
For the sake of completeness I could ask Linus why to keep in the kernel such a source, since of cloud, virtualization, and so on, which is going to execute "something" at the place of the CPU instruction , somewhere (like in the cloud), but this is not something you can say "now NSA reads your random numbers". You can say "you wasted cpu cycles to create useless cypher", nothing more.
I think the point made by the conspiracy theorists is that if you know that the RDRAND output is the last thing XORed into the whole chain of calculation, and if you happen to have in a known spot in the cache (or even some other register) the value that RDRAND is going to be XORed with, then you can just make RDRAND return that very value instead, and your final random number will be zero every time.
That wouldn't be possible if the RDRAND entropy were mixed in earlier…
And you could very easily check that the final PRNG output has become zero.
Or you could output the supposedly contaminated final PRNG stream and analyze it offline thereby easily finding the problem.
Finding flaws in PRNGs is fairly tricky. You can readily see if they are bad, but not so easily confirm they're good.
And (as Taylor Hornby notes) if you are going to Trojanise RDRAND to that extent, you could make the dodgy behaviour contingent on some other factor, such as the appearance of a specific bit pattern in a register or in the cache, so there was even an element of remote control.
Given knowledge of the slurry XOR operand and control over RDRAND, you can make the result have any value you choose, as you suggested. I think the likely objective, then, would be to cause generation of a specific pseudo random sequence.
A pseudo random sequence, after all, looks very much like real entropy, and, hence, it would be very difficult for us to notice we'd been had. As the attacker, however, you know the pseudo random sequence very well, and, thus, your task in decrypting our communications is significantly lessened.
I have no idea of the practicalities. Processor designs are so vast, with gate numbers in the billions, I can't believe anyone can comprehend them. So, getting a rogue pseudo random generator under my nose wouldn't be that hard!
Perhaps reading the value from RDRAND before processing the slurry for one affords sufficient hardening. I like the simple XOR as (and you should check with someone who truly knows what he's talking about, before relying upon this) A XOR B, for independent A and B, cannot possibly be less random than A is or than B is.
Aaaaaaargh, my veckin’ head!!! Just make an animated video, PLEASE!
"Ironically, the random.c source code suggests that a tainted source of randomess is a problem …"
RAND-o-mess – what a delicious Freudian slip …
Cheers.
Hmmm. I think I shall leave that one 🙂
Gee, open source documentation is neither accurate nor up to date…
What a surprise… 🙂
Also, I suspect it's a stretch to say Linus was referring to "us" when he told this particular individual to go back and RTFM… He was referring to people who assumed that he and the other kernel developers were incompetent and weren't handling security properly.
Now, it may well be that one or more kernel developers ARE ignorant and incompetent in security. But it seems to me that no one – including this article – has established that what has been done in this case has been incompetent – aside from lousy documentation – which is a problem endemic to the entire software industry if my thirty-year experience has any relevance.
In short, another tempest in a teapot based solely on Linus' well-known and amusing rudeness to people less competent than he…
Considering that Linus did not invent much of anything, just re-implemented and popularized an already started kernel, I think his competence is overrated. But by being rude he will manage to keep people from questioning that, especially in complex hard-to-prove concerns such as cryptography in a poorly documented open source program.
YES! And the truth shall set you free! Linus did *NOT* invent linux, and his contribution of the kernel and promotion / bundling of the gnu sources with “his” kernel was subject to much underlying research from others. I don’t think even he thinks he invented linux! While some of his frustrations I can at least relate to (regarding hardware vendors), he is for the large part a rather hyper-opinionated git. Pun intended. Look at the contributions of Stallman et al… argh… even Gates and Jobs… Linus is a talent but sadly seems most famous for his mouth.
The author of the article does not seem to understand the nature of XOR.
If you XOR random data with organised data, you get random data
Errr, the issue is whether that data you're XORing with is random, or what you might call "anti-random".
If your "organised data" has a specific relationship under XOR with the random data, then you don't get random data back, you get organised data.
E.g. if the "organised data" is *organised so it is identical to the random data*, then RND XOR ORG is just RND XOR RND, which is a string of zeros.