A number of friends, acquaintances and readers have asked me recently about “the recent Frankenstein virus research paper thing.”
Bankrolled at least in part by the US Air Force, and openly touted by its authors as “a powerful tool for active defense (e.g., offensive cyber-operations),” this internet-era Modern Prometheus story has been widely covered in the technology media, often with a degree of admiration bordering on breathlessness.
When I first heard the full title of the paper, Frankenstein: Stitching Malware from Benign Binaries, and realised that its goal was to come up with a strategy for deliberately creating malware that is harder to detect, my gut reaction was, “We don’t need it, it won’t work anyway, but it’ll make catchy headlines.”
Just how reasonable were my visceral and unscientific conclusions?
Very briefly explained, the authors, Vishwath Mohan and Kevin W. Hamlen, describe a mechanism for constructing malicious programs entirely out of code sequences which already appear in legitimate software installed on the victim’s computer.
Of course, this raises the question, “Why bother?”
The authors offer an opinion on that matter, saying, “By creating new copies entirely from byte sequences obtained from benign files, we argue that it becomes significantly more difficult for defenders to infer adequate signatures that reliably distinguish malware from non-malware on victim systems.”
In less orotund words, “Because it makes the malware harder to detect.”
One problem I see with the Frankenstein approach – aside from my understandable objection to anything which makes hard-to-detect malware easier to create – is that there is no reason to assume that this Frankenware will be harder to detect simply because it consists of byte sequences which already appear on the victim’s computer.
In fact, its inevitably synthetic construction – knitted together as it is from bits and pieces plundered from a completely different context – may paradoxically make it easier to spot.
The object code in Frankenware, you can reasonably assume, will inevitably be of an unnatural and unusual form. It may work, but it will be weird. On that basis alone, you can envisage a practical and effective way to detect it, even without knowing or caring what it does.
And this highlights the second problem I see with the Frankenstein approach.
Frankenware presumes that anti-virus software relies almost entirely on the static recognition of known code sequences without regard for their order or the context in which they appear, and without considering the overall effects of that code. It assumes, simplistically, that some permutation of a known-good program will, ipso facto, itself look good.
That’s not correct, as far as I can see, any more than a statement which only uses words which come from the Unanimous Declaration of the Thirteen United States of America will inescapably read like a unilateral assertion of political independence.
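To make the Declaration analogy concrete, here is a minimal sketch in Python. It is a toy illustration only, not how any real anti-virus product works, and the function names (`vocabulary_only_check`, `order_aware_score`) are made up for this example. It compares a check that only asks "does every piece appear in known-good material?" with one that also looks at which pieces sit next to each other.

```python
def bigrams(tokens):
    """Return the list of adjacent token pairs in a sequence."""
    return list(zip(tokens, tokens[1:]))

def vocabulary_only_check(sample, known_good):
    """Naive check: True if every token appears somewhere in known-good material."""
    vocab = set(known_good)
    return all(tok in vocab for tok in sample)

def order_aware_score(sample, known_good):
    """Fraction of adjacent pairs in the sample that also occur in known-good material."""
    seen = set(bigrams(known_good))
    pairs = bigrams(sample)
    if not pairs:
        return 1.0
    return sum(p in seen for p in pairs) / len(pairs)

# A short phrase from the Declaration stands in for "legitimate software".
known_good = ("we hold these truths to be self-evident "
              "that all men are created equal").split()

# A "stitched" sample that uses only words plundered from the original.
stitched = "equal truths hold all men to be created we".split()

print(vocabulary_only_check(stitched, known_good))  # every word is "known good"
print(order_aware_score(known_good, known_good))    # the original, in order
print(order_aware_score(stitched, known_good))      # the stitched version
```

The stitched sample passes the vocabulary-only check, because every word is borrowed, but only a quarter of its adjacent pairs appear in the original. Order and context give it away, even in a model this crude.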
Mohan and Hamlen have based their work on the concept of gadgets, which form the basis of Return Oriented Programming (ROP).
In ROP, you rely on finding code sequences already in memory on your victim’s computer and knitting them into a meaningful whole. You don’t so much care what the gadgets are, but where they are – specifically, that they are already loaded into memory pages marked as executable.
You don’t use the gadgets as a means of looking normal, but as a way of behaving abnormally: they let you run code of your choice even when you can’t inject executable code of your own.
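The "where, not what" point can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: the `page` buffer is made up rather than taken from any real program, `find_gadget_candidates` is a hypothetical helper, and real gadget-finding tools disassemble instructions properly rather than just scanning for a single opcode byte.

```python
RET = 0xC3  # the x86 near-return opcode, which ends a classic ROP gadget

def find_gadget_candidates(code: bytes, max_len: int = 3):
    """Return (offset, raw_bytes) for each run of up to max_len bytes ending in RET.

    This only scans bytes; it does not check that the bytes before the RET
    decode to sensible instructions.
    """
    candidates = []
    for end, byte in enumerate(code):
        if byte == RET:
            start = max(0, end - max_len + 1)
            candidates.append((start, code[start:end + 1]))
    return candidates

# Made-up stand-in for an executable page:
#   pop eax; ret; nop; nop; pop ebx; pop ecx; ret
page = bytes([0x58, 0xC3, 0x90, 0x90, 0x5B, 0x59, 0xC3])

for offset, raw in find_gadget_candidates(page):
    print(offset, raw.hex())
```

What matters to the ROP author is the offset in the first column, in other words an address in memory that is already marked executable. Whether the bytes there look natural is beside the point.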
So, borrowing the overall concept of gadgets was an interesting idea for Frankenware, but in the end, a fruitless one, because gadgets are neither necessary nor sufficient for creating hard-to-find malware.
In summary:
- Gadget-built code doesn’t look more natural just because it’s built from strings already on your hard disk.
- Gadget-built code isn’t harder to detect simply because the strings it uses can be found in legitimate programs.
- The world doesn’t need more malware, and it certainly doesn’t need more malware construction tools.
In short, is Frankenware which vaguely resembles existing legitimate software, in the same way that a patchwork quilt resembles a woven blanket, likely to be the basis of undetectable malware in a future cyberwar?
I doubt it.
Malware which, at first sight, is legitimate software (because it’s digitally signed by someone you trust, or delivered officially by your service provider, for example) is in my opinion a much more serious concern.
Then again, Mohan and Hamlen had a paper accepted at the 2012 USENIX Workshop on Offensive Technologies, plus a trip to Washington to present it, and I did not.
–
I don’t get it. What’s the difference between a Frankenstein virus and a file-infecting virus of the kind that has been around for a long time?
I mean file infecting viruses infect legit files on the computer and what’s the difference between the two?
The idea here is that the actual code of the malware – whether it's a parasitic (file infecting) virus, a worm (self-contained virus) or Trojan (non-self-replicating malware) – is subjected to a "make-it-look-legitimate" transformation.
So each sample of the malware code (whether parasitic or standalone) can be made different, but in a natural-looking way.
It's not just whether you infect an existing legitimate file, but whether your infected file still looks legitimate overall, even after it's been infected.
Ummm… Wouldn’t the code that constructs the malware from the legitimate strings be a suitable detection target? This really sounds like researchers desperate for something to publish to advance their careers or cover a thesis requirement.
Yes, the Frankenstein "morph my virus using pre-existing strings" code (remember that Frankenstein was the scientist who made the monster, not the monster himself) would indeed be a detection target. So too would its run-time behaviour. So too would both the static and the dynamic behaviour of anything it created. Etc.
Of course, it needn't be delivered with the malware (you could have it "in the cloud" on a server, just like the crooks already do today with so-called "server-side polymorphic" malware), though then it wouldn't strictly be scanning the user's computer for existing code sequences. It would have to guess what was on your PC, which rather defeats the purpose of much of the research.
Having said that, you'd use the morpher to morph itself every time you produced a new malware sample, so it too would be a moving detection target.
I don't want to be drawn on the whys of the research. I'll assume the researchers thought it was a good idea at the time – and it did get them to a prestigious conference, after all – but I don't share their enthusiasm for its value.
But that's part of the rough-and-tumble of research, is it not? You hear about "gadgets". You decide they sound like a decent hammer. You go looking for some nails. You try to bash them into the wall with your new-found hammer. You realise a screwdriver would have been better. You try something else.
If I were to be kind to the researchers, I'd say, "You can't make an omelette without breaking a few eggs." If I wanted to be cynical, I'd say, "Who wants omelettes?"