Remember row hammering?
It’s an old and well-known problem with computer memory – the sort of memory known as dynamic RAM, or DRAM for short.
DRAM is constructed as a silicon chip consisting of a tightly packed grid of minuscule storage capacitors arranged in electrically connected rows and columns.
Greatly simplified, row hammering means reading the same DRAM memory addresses over and over again, concentrating electronic activity in one tiny part of the chip for sufficiently long to interfere with nearby memory cells.
From time to time, some of those nearby cells may change their electrical charge, flipping them from 0 to 1 or from 1 to 0.
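To make the effect concrete, here is a minimal Python sketch of what a single bit-flip does to a stored value. The helper name `flip_bit` is purely illustrative, and the code models only the outcome of a flip, not the electrical interference that causes it.

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with the chosen bit inverted (0 becomes 1, 1 becomes 0)."""
    return value ^ (1 << bit)

stored = 0b01000001                 # the byte for the letter 'A' (65)
corrupted = flip_bit(stored, 1)     # bit 1 changes from 0 to 1
print(chr(stored), "->", chr(corrupted))   # prints: A -> C
```

One flipped bit is enough to turn one valid value into a different valid value, which is why even rare flips matter.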
Concerns over row hammering have led to a series of recent changes and patches in most contemporary operating systems and commonly used apps, notably browsers.
These changes have made it harder and harder to cause bit-flips at all, let alone to provoke them at will in an exploitable way.
Well, sort of.
Dutch researchers at the Vrije Universiteit in Amsterdam noted that most of the mitigations against row hammering had focused on the interaction between your device’s CPU and its RAM.
But modern devices don’t just have a CPU; they typically have a range of auxiliary processors, too, notably including one or more GPUs, or graphics processing units.
GPUs are devoted to accelerating the sort of mathematical and bit-twiddling operations that graphics-intensive apps demand.
A journey of many steps
The researchers decided to see if they could use code running on the GPU in an Android device to pull off row hammering tricks that wouldn’t be possible via traditional programming techniques.
To make the problem even more specific and interesting, they also wanted to see if they could do all of this without requiring a rooted Android, and without relying on an already-installed malware app.
They gave their research the trendy name GLitch, where the letters GL come from WebGL, short for Web Graphics Library.
The GLitchers assumed that WebGL’s added features would bring added risks, so that’s what they went looking for.
…but it was a journey of many steps.
First, bypass the cache
For row hammering, however, you need precise control, where “read really means read”, forcing your program code to access the actual silicon in the DRAM chip itself.
That’s harder than it sounds, because modern computers try to speed things up by sidestepping actual DRAM reads as often as possible by storing commonly-used values in special fast storage locations called cache memory.
In contrast, for row hammering to work, you need to create plenty of electronic load on the DRAM circuitry, which means reading the same physical memory area over and over again as fast as you can – without the cache trying to “improve” your performance.
On ARM chipsets, commonly used in mobile devices, it’s possible to empty the cache in order to remove its behaviour from the equation, but regular apps can’t do this – you have to be the Android kernel, or to have a rooted phone.
The Vrije Universiteit team, however, figured out that the GPU memory caching algorithm in the chipset they used in their research was easy to predict.
By accessing memory in a well-defined pattern, they could effectively clog the cache so that it no longer got in the way.
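The idea of clogging a predictable cache can be sketched with a toy simulation. This is an illustration under stated assumptions, not the researchers’ method: the class `TinyLRUCache`, its four-entry capacity and its least-recently-used eviction rule are all hypothetical, and the article does not describe the real GPU cache or the access pattern used.

```python
from collections import OrderedDict

class TinyLRUCache:
    """Toy cache holding `capacity` addresses with least-recently-used eviction."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.lines = OrderedDict()

    def read(self, address: int) -> bool:
        """Return True on a cache hit, False if the read had to go to DRAM."""
        if address in self.lines:
            self.lines.move_to_end(address)      # mark as most recently used
            return True
        self.lines[address] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)       # evict the least recently used
        return False

def dram_reads_of_target(cache: TinyLRUCache, target: int, rounds: int, clog: bool) -> int:
    """Count how many reads of `target` actually reach DRAM."""
    fillers = [target + 1000 + i for i in range(cache.capacity)]
    reached_dram = 0
    for _ in range(rounds):
        if not cache.read(target):
            reached_dram += 1
        if clog:
            for filler in fillers:               # push the target out of the cache
                cache.read(filler)
    return reached_dram

naive = dram_reads_of_target(TinyLRUCache(4), 0x40, rounds=100, clog=False)
clogged = dram_reads_of_target(TinyLRUCache(4), 0x40, rounds=100, clog=True)
print(naive, clogged)   # prints: 1 100
```

In this toy model, reading the target on its own reaches DRAM only once, because every later read is a cache hit. Touching enough other addresses between reads forces every read of the target out to DRAM.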
Second, keep track of time
To do row hammering, you need to figure out which memory addresses live where in the silicon, because you’re relying on concentrating your memory reads on one tiny part of the chip in the hope of interfering with the capacitor circuitry nearby.
DRAM reads happen in bursts of adjacent bits, rather than one bit at a time, so you can tell when you’ve just read from two addresses that are physically close by, because the two reads can be completed in one burst.
That makes the reads happen a tiny bit faster than when you access two addresses that are far apart on the chip.
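As a rough sketch of how such timing differences could be used, the snippet below sorts address pairs into “near” and “far” groups. The timings are invented for illustration and are not measurements from any device; the function names and the simple midpoint threshold are assumptions too.

```python
# Hypothetical timings, in nanoseconds, for reading pairs of addresses.
pair_timings_ns = {
    (0x1000, 0x1040): 52,
    (0x1000, 0x9000): 88,
    (0x2000, 0x2040): 50,
    (0x2000, 0xA000): 91,
}

def midpoint_threshold(timings: dict) -> float:
    """Pick a cut-off halfway between the fastest and slowest observed reads."""
    values = sorted(timings.values())
    return (values[0] + values[-1]) / 2

def classify_pairs(timings: dict, threshold_ns: float) -> dict:
    """Label a pair 'near' if it was read faster than the threshold, else 'far'."""
    return {pair: ("near" if t < threshold_ns else "far")
            for pair, t in timings.items()}

threshold = midpoint_threshold(pair_timings_ns)
labels = classify_pairs(pair_timings_ns, threshold)
for pair, label in labels.items():
    print(hex(pair[0]), hex(pair[1]), label)
```

The gap between the fast and slow groups here is only a few tens of nanoseconds, so the classification is only as good as the clock used to take the measurements.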
But to map out memory this way, you need to be able to keep track of time with astonishing precision – we’re talking about measurements down to nanoseconds, not just microseconds.
To picture how a nanosecond matches up to modern computer speeds, remember that 1GHz is shorthand for “one billion of whateveritis per second”, which means “one billionth of a second each”, and that one billionth of a second is a nanosecond (10⁻⁹ seconds). Even though a microsecond is one millionth of a second (10⁻⁶ seconds), thousands of machine code instructions can run in that time.
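The arithmetic is easy to check. In this small sketch the helper `cycles_in` is an illustrative name and the 2GHz clock is an assumed example; clock cycles are only a rough stand-in for instructions, because real processors do not run exactly one instruction per cycle.

```python
def cycles_in(duration_ns: int, clock_hz: int) -> int:
    """Whole clock cycles that fit into duration_ns nanoseconds."""
    return clock_hz * duration_ns // 1_000_000_000

ONE_GHZ = 1_000_000_000
TWO_GHZ = 2_000_000_000   # assumed example clock speed

print(cycles_in(1, ONE_GHZ))      # one nanosecond at 1GHz: prints 1
print(cycles_in(1000, TWO_GHZ))   # one microsecond at 2GHz: prints 2000
```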
Browsers’ general-purpose timers are now purposely inexact: they are accurate enough for general use, but not precise enough for row hammering trickery.
But our intrepid researchers found a pair of timing functions specific to WebGL that hadn’t yet had their accuracy “smudged” for security purposes.
Thanks to the GLitch paper, browser makers are now deliberately reducing the accuracy of those timers, too (TIMESTAMP_EXT), but the researchers also found other ways to write WebGL code that they claim provided the precision that they needed without using any special timer functions.
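To see why “smudging” a timer defeats this sort of measurement, consider the sketch below. The function `coarsen` and the 100-microsecond granularity are assumptions chosen for illustration; actual browsers pick their own granularity and may use other tricks as well.

```python
def coarsen(timestamp_ns: int, granularity_ns: int) -> int:
    """Round a timestamp down to the nearest multiple of granularity_ns."""
    return timestamp_ns - (timestamp_ns % granularity_ns)

GRANULARITY_NS = 100_000            # assumed: timer ticks every 100 microseconds

start_ns, end_ns = 1_234_567, 1_234_650   # two events 83 nanoseconds apart

precise = end_ns - start_ns
smudged = coarsen(end_ns, GRANULARITY_NS) - coarsen(start_ns, GRANULARITY_NS)
print(precise, smudged)   # prints: 83 0
```

With the coarse timer, an 83-nanosecond interval and a 50-nanosecond interval usually look the same, so the timing differences that reveal physically nearby addresses disappear from view.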
Third, map out the DRAM chip
If you can bypass the cache to perform “real” memory accesses, and you can time those accesses with sufficient precision, you’re in a position to map out the DRAM chip.
You don’t need to construct a detailed layout of the whole memory space – it’s sufficient to figure out when you have three physically adjacent rows of DRAM capacitors.
With access to three contiguous rows of capacitors in the chip, you can repeatedly and rapidly read data out of the outer two rows, creating sufficient electrical activity to give you a good chance of flipping one or more bits in the row of capacitors sitting in the middle.
This is called “double-sided row hammering”, for obvious reasons.
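The difference between hammering one row and hammering two can be sketched with a toy model. Everything here is a simplifying assumption: the class `ToyDram`, the idea that each read adds one unit of “disturbance” to the rows either side, and the flip threshold of 1500 are invented for illustration, and the model ignores the regular refreshing that real DRAM performs.

```python
class ToyDram:
    """Toy DRAM: each read of a row disturbs its two direct neighbours."""
    def __init__(self, rows: int, flip_threshold: int):
        self.disturbance = [0] * rows
        self.flip_threshold = flip_threshold

    def read_row(self, row: int) -> None:
        for neighbour in (row - 1, row + 1):
            if 0 <= neighbour < len(self.disturbance):
                self.disturbance[neighbour] += 1

    def flipped_rows(self) -> list:
        """Rows disturbed often enough that a bit is assumed to have flipped."""
        return [row for row, count in enumerate(self.disturbance)
                if count >= self.flip_threshold]

def hammer(dram: ToyDram, aggressor_rows: list, rounds: int) -> None:
    """Read each aggressor row in turn, over and over."""
    for _ in range(rounds):
        for row in aggressor_rows:
            dram.read_row(row)

single = ToyDram(rows=8, flip_threshold=1500)
hammer(single, [3], rounds=1000)        # single-sided: only row 3 is read

double = ToyDram(rows=8, flip_threshold=1500)
hammer(double, [3, 5], rounds=1000)     # double-sided: rows 3 and 5 sandwich row 4

print(single.flipped_rows(), double.flipped_rows())   # prints: [] [4]
```

In the double-sided case the middle row is disturbed from both directions, so it crosses the threshold in the same number of rounds that leaves every row intact in the single-sided case.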
Fourth, figure out the memory allocator
Getting the operating system to dish out memory corresponding to three adjacent DRAM rows isn’t as simple as asking for three identically sized memory blocks, one after the other.
In fact, with the Android memory allocator used to support the GPU that the researchers were targeting, three memory allocations in a row didn’t produce adjacent memory blocks at all.
But by studying the allocator, the researchers figured out how to construct a mixture of allocations and deallocations so that they reliably ended up with memory dished out from adjacent rows of capacitors inside the DRAM itself.
Once they had three adjacent rows of DRAM real estate allocated, plus high-speed direct read access to that physical memory, they had a “hammerable row” lined up that they could subject to an electronic pummelling in the hope of deliberately corrupting it.
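The general trick of steering an allocator with a mixture of allocations and deallocations can be shown with a toy example. This is not the Android allocator and not the researchers’ technique: `ToyAllocator`, its reuse-the-last-freed-page rule and the page numbers are all hypothetical, and the sketch treats neighbouring page numbers as a stand-in for neighbouring DRAM rows.

```python
class ToyAllocator:
    """Toy page allocator that hands out the most recently freed page first."""
    def __init__(self, free_pages):
        self.free_pages = list(free_pages)

    def alloc(self) -> int:
        return self.free_pages.pop()

    def free(self, page: int) -> None:
        self.free_pages.append(page)

def is_adjacent_run(pages) -> bool:
    """True if the pages form one unbroken run of consecutive numbers."""
    ordered = sorted(pages)
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))

FRAGMENTED = [6, 1, 4, 7, 2, 5, 0, 3]   # assumed scrambled free list

# Naive approach: just ask for three pages in a row.
naive_allocator = ToyAllocator(FRAGMENTED)
naive = [naive_allocator.alloc() for _ in range(3)]

# Massaged approach: drain the pool, then free three neighbours in a chosen order.
massaged_allocator = ToyAllocator(FRAGMENTED)
held = [massaged_allocator.alloc() for _ in range(len(FRAGMENTED))]
for page in (4, 3, 2):
    massaged_allocator.free(page)
massaged = [massaged_allocator.alloc() for _ in range(3)]

print(naive, is_adjacent_run(naive))        # prints: [3, 0, 5] False
print(massaged, is_adjacent_run(massaged))  # prints: [2, 3, 4] True
```

The point of the sketch is only that an allocator with predictable behaviour can be nudged into a known state, after which the next few requests land exactly where the attacker wants them.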
Still not enough…
The power to corrupt memory at will, even if it’s only a single bit in a quasi-random location, can always be considered an exploit – at the very least, you could force an app or even the whole device to crash, thus causing a denial of service attack (DoS).
But the GLitchers went further than just a browser-driven DoS attack.
That means not only data leakage by reading from memory that’s supposed to be private, but also the possibility of remote code execution (RCE) by poking machine code into protected memory and then running it.
What to do?
Previous row hammering attacks were often considered irrelevant on mobile devices.
Either you needed to install an app that was already authorised to pull off the very sort of attack that row hammering might help you to achieve, or you needed to wait ages to have any hope of success, during which time other activity on the device would probably disrupt the attack and send you back to square one.
Does that mean Android is broken and you should stop using it?
So far, the researchers have a limited set of attacks that work under controlled circumstances on an outdated device of their own choosing, running an old version of Android.
Nevertheless, GLitch reminds us that when you add features and performance – whether that’s building GPUs into mobile phone chips, or adding fancy graphics programming libraries into browsers – you run the risk of reducing security at the same time.
If that happens, it’s OK to back off a bit, deliberately reducing performance to raise security back to acceptable levels.
Chip image courtesy of https://zeptobars.com/en/contacts.