OpenSSH, perhaps the most widely used remote access security system on the internet, has just patched a possible remote code execution bug.
The flaw was discovered on 07 November 2013 by an OpenSSH developer, and the fix was announced and published the next day.
SSH stands for secure shell, where the term shell is UNIX-speak for a command prompt, and SSH is indeed commonly used for remote access to a terminal-style login.
But because SSH actually creates a general-purpose encrypted data channel – what’s often called a secure tunnel – between two computers on the internet, it’s used for much more than just shell logins.
Notable uses include secure file transfer between servers, and secure data synchronisation between data centres.
And since OpenSSH is by far the most widely used implementation of SSH, potential remote code execution bugs in it are the stuff of nightmares for system administrators.
So we have to congratulate the OpenSSH team for reacting so quickly, not least because the project, in its own words, has “no wealthy sponsors, nor a business model.”
Their advisory, as is commendably common from the open source community, also deals succinctly with the very questions that an inquisitive administrator might want to ask, such as:
- What caused the bug.
- Why it should be considered an RCE flaw.
- How likely it is to be exploitable.
- What you need to do to fix it.
- How much was changed in the software.
- How to fix older versions.
And that raises the question, for those who aren’t network security specialists or sysadmins themselves: “What went wrong?”
Very briefly – and I hope the OpenSSH guys will forgive me if they think I have oversimplified – the OpenSSH code supports a range of different algorithms for encryption (that’s what keeps the data secure as it traverses the internet) and for message authentication (that’s what keeps the data correct and unmodified in transit).
As part of setting up the data structures needed to open a new secure channel, memory was allocated for the functions of encrypting and authenticating, and this memory included space for what are known as callbacks – run-time specified program code that will be triggered when something of interest happens.
In C, a callback is basically a function pointer: a data variable that gives the program a memory address to which it should send control to perform a specific task.
Clearly, if a remote attacker can tweak the content of a callback variable, then when the callback happens, the attacker might be able to divert program execution into his own code, and thereby take over your system.
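To make the idea concrete, here is a minimal sketch of a callback in C. Nothing here comes from the OpenSSH source: the names callback_fn, handler, fire, double_it and negate_it are made up for illustration. The point is simply that the function that runs is decided by a value held in ordinary data memory.

```c
/* Hypothetical callback type: any function taking an int and returning an int. */
typedef int (*callback_fn)(int value);

/* Two candidate callbacks (illustrative only). */
int double_it(int value) { return value * 2; }
int negate_it(int value) { return -value; }

/* The callback lives in a struct as plain data: it is just a stored address. */
struct handler {
    callback_fn on_event;
};

/* Transfer control to whatever address is currently stored in the struct. */
int fire(const struct handler *h, int value)
{
    return h->on_event(value);
}
```

If you set on_event to double_it, then fire(&h, 21) runs double_it; overwrite the field with negate_it and exactly the same call runs different code. Whoever controls the stored address controls where the program goes next.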
The issue here is that OpenSSH keeps a callback – a function pointer – used for finalising and cleaning up the message authentication algorithm in use.
Usually, when you open an SSH connection, you specify the algorithms to use for encryption and authentication, and OpenSSH initialises all the needed data structures for them, including filling in the addresses of the callbacks that let the algorithms do their work.
But some modern encryption algorithms provide encryption and authentication all wrapped into one, on the grounds that anyone who is serious about security online wouldn’t do the former without the latter.
One example is AES-GCM, short for Advanced Encryption Standard – Galois/Counter Mode. (Explaining AES-GCM is definitely an article for another time; what matters here is that using it means authentication is covered without using a second algorithm.)
So, when the vulnerable version of OpenSSH sets up an AES-GCM connection, it allocates memory for the authentication algorithm, but never initialises it – the memory just contains whatever was there from before.
You may be able to guess what comes next: when the non-existent authentication algorithm is cleaned up, the non-existent cleanup callback is invoked, using as a function pointer the value that was previously in memory.
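The pattern can be sketched in a few lines of C. This is not the OpenSSH code: the types session_keys and mac_ctx, and the functions session_keys_new, session_keys_free and count_cleanup, are hypothetical stand-ins. The sketch shows the safe shape of the logic, where the block is zeroed on allocation and the cleanup callback is checked before it is called. It also assumes, as is true on mainstream platforms, that an all-zero pointer is a null pointer.

```c
#include <stdlib.h>

/* Hypothetical cleanup callback type for an authentication (MAC) algorithm. */
typedef void (*mac_cleanup_fn)(void *state);

struct mac_ctx {
    void *state;             /* algorithm-specific data, if any */
    mac_cleanup_fn cleanup;  /* callback invoked when the keys are torn down */
};

struct session_keys {
    int uses_aead;           /* nonzero if the cipher authenticates on its own */
    struct mac_ctx mac;      /* left unfilled when uses_aead is set */
};

/* Sample callback for illustration: counts how often it is called. */
void count_cleanup(void *state)
{
    if (state != NULL)
        ++*(int *)state;
}

/* Allocate the key block zeroed, so an unfilled mac.cleanup reads as NULL
 * (assumption: all-bits-zero is a null pointer on this platform). */
struct session_keys *session_keys_new(int uses_aead, mac_cleanup_fn fn)
{
    struct session_keys *k = calloc(1, sizeof(*k));
    if (k == NULL)
        return NULL;
    k->uses_aead = uses_aead;
    if (!uses_aead)
        k->mac.cleanup = fn;  /* only a separate MAC gets a callback */
    return k;
}

/* Free the block; returns 1 if the cleanup callback ran, 0 if it did not. */
int session_keys_free(struct session_keys *k)
{
    int ran = 0;
    if (k == NULL)
        return 0;
    if (k->mac.cleanup != NULL) {
        k->mac.cleanup(k->mac.state);
        ran = 1;
    }
    free(k);
    return ran;
}
```

Swap the calloc() for a plain malloc() in a sketch like this and the combined-mode case would leave mac.cleanup holding whatever bytes were in the block before, so the check against NULL could no longer be trusted.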
The fix was clean and simple, and the bug no doubt an embarrassing oversight by the security-conscious OpenSSH coders.
Change this line of code:
newkey = xmalloc(sizeof(*newkey));
to this one:
newkey = xcalloc(1, sizeof(*newkey));
The function malloc(), or derivatives like the call to xmalloc() you see above, instructs the system to reserve memory for your program.
A call to calloc() does exactly the same, but then fills the newly-dished-out memory with zeros.
With malloc(), depending on how it is implemented, you might end up with second-hand detritus from previous use of the memory; with calloc(), you start with a clean block of memory that clearly annotates itself to say, “This memory has not yet been initialised for use.”
In fact, if you’re a C programmer, make it a habit always to use calloc(), not malloc(), unless there are good reasons for preferring the latter.
→ If it is clear in your code that you correctly initialise the memory yourself immediately after allocating it, then using malloc() is faster because the memory won’t be written to twice. But if in doubt, just use calloc(). All other things being equal, go for extra security over extra speed.
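If you want to adopt that habit in your own code, a small wrapper helps. The sketch below is loosely modelled on the idea behind xcalloc(), but it is not the OpenSSH implementation: checked_calloc and is_all_zero are hypothetical names, and exiting on failure is just one possible policy.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: zero-filled allocation that never returns NULL.
 * On failure it prints a message and exits (an assumed policy). */
void *checked_calloc(size_t count, size_t size)
{
    void *p;

    if (count == 0 || size == 0) {
        fputs("checked_calloc: zero-sized request\n", stderr);
        exit(EXIT_FAILURE);
    }
    p = calloc(count, size);  /* calloc also guards against count*size overflow */
    if (p == NULL) {
        fputs("checked_calloc: out of memory\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Helper for illustration: returns 1 if every byte in the range is zero. */
int is_all_zero(const void *p, size_t len)
{
    const unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < len; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

A block returned by checked_calloc(4, 16) passes is_all_zero() over all 64 bytes straight away, which is exactly the guarantee that malloc() does not give you.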
Indeed, while fixing the known-bad use of malloc() in handling AES-GCM, the OpenSSH coders have taken the opportunity to change xmalloc() to xcalloc() in 20 other places in the code, for a spot of programmatic proactivity.
In conclusion, we’ll point out that this bug is potentially exploitable, but the risk of a working exploit actually being created must be considered very low.
As OpenSSH’s own advisory points out:
This vulnerability is mitigated by the difficulty of pre-loading the heap with a useful callback address and by any platform address-space layout randomisation applied to sshd and the shared libraries it depends upon.
Still, if you use OpenSSH anywhere in your network – and you probably do – you might as well grab the latest version.
Consider your own rapid response to be a way of rewarding the OpenSSH guys for fixing and documenting this bug pretty jolly quickly.