Intel is adding two new exploit detection systems into its forthcoming processors.
The new technology has been at least four years in the making, according to the chip giant’s recently updated specification document, which contains a “version 1.0” release date of June 2016.
Intel’s PR machine has been making waves about the system, known as CET for short, or Control-flow Enforcement Technology in full, for a while…
…and now it’s officially out for you to take a look at. (Warning: the specification document runs to 358 pages.)
As far as we can see, the first Intel processors that will include these new protections are the not-quite-out-yet CPUs known by the nickname “Tiger Lake”, so if you’re a programmer you can’t actually start tinkering with the CET features just yet.
Nevertheless, CET reminds us all that computer security is a cat-and-mouse game, where one round of security improvements provokes a change in behaviour by cybercriminals, which in turn leads to a new wave of defences, and so on.
Loosely speaking – very loosely, given that we’re summarising a 358-page document – CET aims to make remote code execution exploits harder than they are now by keeping a tighter rein on how programs behave.
More precisely, CET aims to keep an eye out for how programs misbehave, so that it’s easier to detect when a program has crashed, and therefore to stop crooks coming up with sneaky ways of crashing-yet-keeping-control-over buggy programs.
Exploiting memory errors
Errors in using memory are one of the leading causes of software bugs that lead to security holes, known in the trade as vulnerabilities.
For example, if I ask the operating system for 64 bytes’ worth of temporary storage, say to generate and store a cryptographic key, but then accidentally save 128 bytes of random data into it, I’ll trample all over whatever comes next in memory.
A memory block that’s allocated for your own use is known colloquially as a buffer, so writing outside your own buffer and into someone else’s is known as a buffer overflow.
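To make the trampling concrete, here is a toy model in Python. It is only a sketch: Python checks bounds for you, so a flat bytearray stands in for raw memory, and the names (memory, KEY_BUF, NEIGHBOUR, unchecked_write, checked_write) are made up for illustration.

```python
# Toy model: one flat bytearray stands in for raw process memory.
memory = bytearray(256)

KEY_BUF = 0           # our own 64-byte buffer starts at offset 0...
KEY_BUF_SIZE = 64
NEIGHBOUR = 64        # ...and somebody else's data starts right after it
memory[NEIGHBOUR:NEIGHBOUR + 8] = b"TRUSTED!"


def unchecked_write(offset, data):
    """Copy data into memory without checking the buffer size (the bug)."""
    memory[offset:offset + len(data)] = data


def checked_write(offset, size, data):
    """Refuse to copy more than the buffer can hold (the fix)."""
    if len(data) > size:
        raise ValueError("write would overflow the buffer")
    memory[offset:offset + len(data)] = data


before = bytes(memory[NEIGHBOUR:NEIGHBOUR + 8])

# Stand-in for 128 bytes of key material written into a 64-byte buffer.
unchecked_write(KEY_BUF, b"\xaa" * 128)

after = bytes(memory[NEIGHBOUR:NEIGHBOUR + 8])
print(before, after)
```

The neighbour’s eight bytes are silently replaced by our data, and nothing in the toy “program” notices. The checked version shows the usual fix: compare the length against the buffer size before copying.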
Another way that data commonly gets trampled is known as use after free, where I accidentally save data into a block of memory that I already told the operating system I didn’t need any more, and that therefore might already have been handed out to be used somewhere else.
Even if I carefully write my limit of 64 bytes and avoid a buffer overflow error, I’m still writing where I shouldn’t.
So even though a use-after-free bug isn’t technically referred to as an overflow, you can think of it that way because I am writing 64 bytes to a buffer where I am currently supposed to write no bytes at all.
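A use-after-free can be sketched the same way. The ToyHeap class below is a deliberately naive allocator invented for this example (real allocators are far more elaborate), and the “is_admin” data is purely hypothetical.

```python
class ToyHeap:
    """A deliberately naive allocator that hands freed blocks straight back out."""

    def __init__(self):
        self.blocks = {}     # block id -> bytearray
        self.free_ids = []   # ids of blocks that have been freed
        self.next_id = 0

    def alloc(self, size):
        # Reuse a freed block of the same size if one exists.
        for block_id in self.free_ids:
            if len(self.blocks[block_id]) == size:
                self.free_ids.remove(block_id)
                return block_id
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = bytearray(size)
        return block_id

    def free(self, block_id):
        self.free_ids.append(block_id)

    def write(self, block_id, data):
        # No check that the caller still owns the block: that is the bug.
        self.blocks[block_id][:len(data)] = data


heap = ToyHeap()
mine = heap.alloc(64)
heap.free(mine)                    # "I don't need this any more"
theirs = heap.alloc(64)            # the same block is handed to someone else
heap.write(theirs, b"is_admin=0")  # the new owner stores its own data
heap.write(mine, b"is_admin=1")    # stale handle: only 10 bytes, no overflow
victim_view = bytes(heap.blocks[theirs][:10])
print(victim_view)
```

The stale write stays well inside the 64-byte limit, yet the new owner of the block now sees data it never wrote.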
Memory safety bugs, as they’re called in general, are an obvious cybersecurity risk because they mean that an attacker might be able to craftily alter data that some other part of the program assumes it can trust and therefore later relies upon.
The danger posed by a memory error of this sort depends, of course, on what got trampled.
If the memory bytes that were overwritten contained an error message that only ever gets printed under highly unusual circumstances, then the bug might not be noticed for years, and even if it shows up, the only bad side effects might be to cause an error to go unreported (or be reported incomprehensibly).
But if the memory that got trampled contains any data that the software later relies upon to control the flow of execution in the program, then an attacker can very possibly find a way to abuse that bug to implant malware.
Defending against memory bugs
There are two main ways that memory overwrite bugs can be exploited to divert execution.
One relies on modifying what’s known as the stack, a block of memory that the CPU uses (amongst other things) to keep track of subroutine calls in software.
When you call a program subroutine, for example getch(), which reads in the next input character, usually from the keyboard, the processor keeps track of where you CALLed it from so that the subroutine can simply run a RETurn instruction to get back where it was before, to the next instruction after the CALL.
So, if you can mess with the stack, you can often mess with the next RET instruction so the program doesn’t go back where it came from but instead heads off into unauthorised territory of your choice.
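Here is a minimal sketch of that idea, assuming a toy stack frame in which four slots of local buffer sit directly before the saved return address. The function name call_with_input and the addresses are hypothetical, and a real stack frame holds more than this.

```python
BUF_SLOTS = 4


def call_with_input(user_input, return_address):
    """Simulate one subroutine call; return where its RET would send execution."""
    # Toy frame layout: the local buffer first, the saved return address after it.
    frame = [0] * BUF_SLOTS + [return_address]

    # Buggy copy: nothing checks that user_input fits into BUF_SLOTS.
    for i, value in enumerate(user_input):
        frame[i] = value

    return frame[BUF_SLOTS]


legit = call_with_input([1, 2, 3, 4], 0x401000)
hijacked = call_with_input([1, 2, 3, 4, 0xBAD], 0x401000)
print(hex(legit), hex(hijacked))
```

Four values fit and the subroutine returns to its caller. A fifth value lands in the slot that holds the return address, so the caller of the toy subroutine no longer decides where execution resumes.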
The other relies on modifying the memory location used by a CALL instruction to tell it where to go next: instead of diverting a program when it returns from a subroutine, you divert it when it tries to call or jump to one.
Various protections already exist against these tricks, notably DEP and ASLR.
DEP stands for Data Execution Prevention and it assumes that when attackers modify a RETurn address, or a JMP destination, they’ll need to divert execution to a chunk of code, known as shellcode, that they supplied themselves, typically as part of the data they sent to the errant program in the first place.
But modern CPUs can flag data buffers as “not for execution”, which prevents shellcode supplied as data from running even if attackers manage to CALL to it.
Crooks responded to DEP by using two-stage shellcodes where the first part relies on stringing together code fragments already loaded into memory, for example as part of the running program or one of the DLL files it uses.
These “already executable” fragments, known in the jargon as gadgets, don’t need to do a lot – typically, they’ll just tell the operating system to switch the buffer where the rest of the shellcode resides from “no execution allowed” to “this data is allowed to run as code”.
Then, simply jumping to the second part of the shellcode completes the takeover.
(Note that the gadgets were never intended to be used in this way. The crooks typically comb through system DLLs and hunt for byte sequences that just happen to disassemble to useful code snippets such as ADD THIS or COMPARE THAT, even if the gadgets are themselves part of other instruction sequences.)
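The DEP check, and the two-stage workaround the crooks came up with, can be sketched as a toy model. Everything here is an assumption made for illustration: the Region class, the addresses, and make_executable_gadget, which merely stands in for existing code that asks the operating system to change a buffer’s permissions.

```python
from dataclasses import dataclass


class ExecutionBlocked(Exception):
    """Raised when the toy CPU refuses to run code at an address."""


@dataclass
class Region:
    start: int
    size: int
    executable: bool

    def contains(self, address):
        return self.start <= address < self.start + self.size


code = Region(start=0x1000, size=0x1000, executable=True)    # program code
data = Region(start=0x8000, size=0x1000, executable=False)   # attacker-supplied input
REGIONS = [code, data]


def jump(address):
    """Transfer control, but only into memory flagged as executable."""
    for region in REGIONS:
        if region.contains(address):
            if not region.executable:
                raise ExecutionBlocked(f"{address:#x} is flagged not-for-execution")
            return address
    raise ExecutionBlocked(f"{address:#x} is not mapped")


def make_executable_gadget(address):
    """Stand-in for already-loaded code that flips a region's permission flag."""
    for region in REGIONS:
        if region.contains(address):
            region.executable = True


SHELLCODE = 0x8010

try:
    jump(SHELLCODE)
    blocked_first_time = False
except ExecutionBlocked:
    blocked_first_time = True

make_executable_gadget(SHELLCODE)    # stage one: reuse code that is already there
second_attempt = jump(SHELLCODE)     # stage two: now the jump goes through
print(blocked_first_time, hex(second_attempt))
```

DEP stops the direct jump into the data buffer, but it has nothing to say about legitimate code being reused to lift the restriction first.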
Of course, to misdirect a running program so it transfers control to an “already executable” gadget, the attacker needs to know what memory addresses those gadget bytes are loaded at.
Fifteen years or so ago, that was trivial because every version of Windows loaded its standard set of system DLLs at the same memory addresses every time, so if the crooks could figure out an exploit that knew where to weave around in memory on their test computer…
…it would work on your computer, too, assuming you had the same version of Windows.
ASLR, short for address space layout randomisation, made that much harder, because Windows, and all other mainstream operating systems, now load programs at different locations every time you reboot.
The crooks can easily guess which Windows version you have, but they can’t easily guess which gadgets are at what memory addresses on your computer.
ASLR still not perfect
One problem with ASLR is that if attackers can somehow figure out the memory addresses that are being used on your computer right now, even though they were randomly chosen, they can modify their attack automatically simply by adjusting all gadget addresses in their exploit to suit.
Unfortunately, information about system memory allocation sometimes leaks out due to other, innocent-sounding bugs known as information disclosure flaws.
For example, some programs write log files that are intended to be helpful if ever you need support, accidentally including useful but supposed-to-be-secret data such as “System version data found at address 0x7DEE....” or “KERNEL DLL loaded at 0x7EE3....”.
In other words, the memory layout information that crooks aren’t supposed to be able to figure out for program X might already have been blurted out by program Y.
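The arithmetic behind that adjustment is simple, which is why a single leaked address matters so much. The sketch below assumes a hypothetical DLL whose internal offsets are known to the attacker (because they have the same file), with made-up gadget names, offsets and log format.

```python
import random

PAGE = 0x1000

# Hypothetical offsets inside one DLL; fixed for any given build of the file.
GADGET_OFFSETS = {"gadget_a": 0x1A2B, "gadget_b": 0x3C4D}
KNOWN_SYMBOL_OFFSET = 0x0500   # offset of the item whose address leaks


def load_dll(rng):
    """Pick a random, page-aligned load address, as ASLR would."""
    return rng.randrange(0x10000, 0x7FFF0) * PAGE


def leaked_log_line(base):
    """A 'helpful' log entry that gives away one absolute address."""
    return f"symbol found at {base + KNOWN_SYMBOL_OFFSET:#x}"


def rebase_gadgets(log_line):
    """Recover the load address from the leak and adjust every gadget address."""
    leaked = int(log_line.rsplit(" ", 1)[1], 16)
    base = leaked - KNOWN_SYMBOL_OFFSET
    return {name: base + offset for name, offset in GADGET_OFFSETS.items()}


rng = random.Random()
base = load_dll(rng)
actual = {name: base + offset for name, offset in GADGET_OFFSETS.items()}
guessed = rebase_gadgets(leaked_log_line(base))
print(guessed == actual)
```

Randomisation moves the whole DLL as one unit, so one leaked address plus one known offset reveals where everything else in that DLL has ended up.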
Intel’s new hardware solution aims to go beyond ASLR and takes two forms, called the shadow stack and indirect branch tracking (IBT).
The implementation is complex but the concepts are simple:
- With the shadow stack, the processor will keep two copies of every memory address that a subroutine might RETurn to. One will be stored where it always was, still vulnerable to buffer overflows. The other return address will be saved on the shadow stack, where a buffer overflow can’t (or isn’t supposed to be able to) reach it. Whenever a subroutine tries to RETurn, the two copies will be compared. If they differ, the return address on the regular stack must have been modified incorrectly. In theory, this will detect and prevent both accidental crashes and deliberate exploit attempts.
- The IBT system will introduce a new machine code instruction called ENDBRANCH. Programs that want to make use of IBT can compile these instructions into their code at every point where a CALL is permitted to arrive, creating an allowlist, if you like, of legitimate branch targets. Any CALL that’s modified to end up somewhere else, such as at a “code gadget” picked by an attacker, can be detected and blocked. Crooks should therefore find it somewhere between very hard and impossible to find code gadgets that do what they want.
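The shadow stack comparison can be sketched in a few lines. This is a toy model only: the ToyCpu class and the ReturnAddressMismatch exception are invented for illustration, and in real hardware the shadow stack lives in specially protected memory rather than in a second Python list.

```python
class ReturnAddressMismatch(Exception):
    """Raised when the two copies of a return address disagree."""


class ToyCpu:
    def __init__(self):
        self.stack = []          # regular stack: reachable by buggy writes
        self.shadow_stack = []   # shadow stack: only call() and ret() touch it

    def call(self, return_address):
        # CALL saves the return address in both places.
        self.stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self):
        # RET pops both copies and refuses to continue if they differ.
        address = self.stack.pop()
        expected = self.shadow_stack.pop()
        if address != expected:
            raise ReturnAddressMismatch(
                f"return to {address:#x}, shadow stack says {expected:#x}"
            )
        return address


cpu = ToyCpu()

cpu.call(0x401000)
normal_return = cpu.ret()       # both copies agree, so the return goes ahead

cpu.call(0x401000)
cpu.stack[-1] = 0xBAD0          # simulate an overflow reaching the regular stack
try:
    cpu.ret()
    tampering_detected = False
except ReturnAddressMismatch:
    tampering_detected = True

print(hex(normal_return), tampering_detected)
```

The attacker in this model can still overwrite the regular stack. What changes is that the overwrite gets noticed at the moment of the RETurn instead of being silently obeyed.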
In case you’re wondering how IBT will work in a backwards compatible way, Intel carefully chose an instruction encoding for ENDBRANCH that executes as a NOP, short for “no operation” (i.e. an instruction that does nothing except use up a tiny amount of time and memory), on older CPUs.
So software recompiled for CET-capable processors in the next year or so will still work correctly on older computers.
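The landing-pad check, and the way it quietly disappears on older processors, can be sketched like this. The byte sequence F3 0F 1E FA is the encoding of the 64-bit form of the instruction (ENDBR64) as we understand it from Intel’s documentation rather than from anything stated above; the indirect_call function and the BranchBlocked exception are hypothetical, and the real CPU uses an internal state machine rather than peeking at bytes.

```python
ENDBR64 = bytes.fromhex("f30f1efa")   # 64-bit ENDBRANCH marker
NOP = b"\x90"                         # ordinary one-byte NOP
RET = b"\xc3"                         # near return


class BranchBlocked(Exception):
    """Raised when an indirect branch lands somewhere unmarked."""


# Toy code region: a marked function entry at offset 0, then its body.
code = ENDBR64 + NOP * 4 + RET


def indirect_call(code, target, ibt_enabled=True):
    """Allow the branch only if the target starts with the ENDBRANCH marker."""
    if ibt_enabled and code[target:target + 4] != ENDBR64:
        raise BranchBlocked(f"offset {target} is not a permitted branch target")
    return target


entry = indirect_call(code, 0)        # legitimate entry point: allowed

try:
    indirect_call(code, 5)            # mid-function "gadget": no marker there
    gadget_blocked = False
except BranchBlocked:
    gadget_blocked = True

# On an older CPU there is no check, and the marker just runs as a NOP.
legacy = indirect_call(code, 5, ibt_enabled=False)

print(entry, gadget_blocked, legacy)
```

The same bytes therefore serve two purposes: a do-nothing instruction where IBT is absent, and an allowlist entry where it is present.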
Is this the end of exploits?
As Intel’s own press release points out, “No product or component can be absolutely secure. Your costs and results may vary.”
Having said that, we suspect that CET will, in general, make things harder for the crooks, so we look forward to it being more widely available.