Notes from SophosLabs: On the trail of rootkits and other malware

This is the first in an occasional series looking at some of the techniques we use in SophosLabs to help us take malware apart.

We hope you enjoy this article – if there are any topics you’d like us to cover in future articles, please let us know!

Many thanks to Mike Wood of SophosLabs in Vancouver for his behind-the-scenes effort that made this article possible.


When an interesting new piece of malware makes the news, the first questions people ask are usually, “How does it work? What does it do?”

In the old days, back when there were no more than a few hundred new viruses each year, almost all of them written in assembly language, we’d often start with a static, analytical approach by disassembling or decompiling the machine code itself.

Once we knew what sequence of operations the malware performed – for example, that it scanned through the directories on the C: drive and appended itself to every .COM file – we would then run the malware on a freshly-prepared computer and confirm our analysis using a dynamic, deductive approach.

But these days there are hundreds of thousands of new malware samples every day, written in a variety of programming languages, and delivered in a variety of ways.

The vast majority of the samples we get aren’t truly new, of course.

They’re unique only in the strictly technical sense that they consist of a sequence of bytes that we haven’t encountered before, in the same way that “Good morning” and “GOOD MORNING” are not literally the same.

Indeed, most of the new samples that show up each day are merely minor variants of malware that we already detect, or known malware that has been encrypted or packaged differently.

Nevertheless, that still leaves plenty of samples worth looking at.

So, these days we usually start dynamically and deductively, using automated systems that run the malware in a controlled environment, instead of first trying to deconstruct each new sample by hand, as we did in the 1980s.

And that leaves us with the questions behind the questions that we asked at the start, namely, “How do you tell how it works? How do you keep track of what it does?”

On the trail

Common monitoring techniques when you are following the scent of a suspicious program include:


Snapshotting

Take a “before snapshot” that records the state of the system, such as the names and checksums of all the files and the contents of the registry, and store it somewhere safe.

Run the malware.

Take an “after snapshot” and compare it with the first.
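The snapshot-and-compare steps can be sketched in a few lines of Python. This is a minimal illustration only, not the tooling used in SophosLabs: it covers files but not the registry, and the function names `take_snapshot` and `compare_snapshots` are invented for this example.

```python
import hashlib
import os


def take_snapshot(root):
    """Map each file under root (path relative to root) to its SHA-256 digest."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            try:
                with open(full_path, "rb") as handle:
                    # Read in chunks so large files do not need to fit in memory.
                    for chunk in iter(lambda: handle.read(65536), b""):
                        digest.update(chunk)
            except OSError:
                continue  # unreadable file: simply skipped in this sketch
            snapshot[os.path.relpath(full_path, root)] = digest.hexdigest()
    return snapshot


def compare_snapshots(before, after):
    """Return sorted lists of added, removed and modified paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    modified = sorted(
        path for path in before.keys() & after.keys()
        if before[path] != after[path]
    )
    return added, removed, modified
```

In use, you would call `take_snapshot()` on the same directory tree before and after running the sample, and pass both results to `compare_snapshots()`.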

System call tracing

Keep track of system calls, such as the self-explanatory CreateFile(), CreateProcess() or URLDownloadToFile(), and record the parameters that were used.

The snapshotting technique tells us how things ended up, and the tracing technique tells us how we got there.

For example, the snapshot can pinpoint files downloaded by the malware, and the trace can identify where they were downloaded from.

But relying only on snapshotting and tracing can leave gaps in our understanding of a malware sample.

Potential problems include:

  • Noise. A new file that shows up as a single item in a snapshot might be created by hundreds of thousands of one-byte-at-a-time calls to WriteFile().
  • Timing. How long should we wait between snapshots? Too long, and the malware might have been and gone; too short, and it might still be waiting for a malicious download to start.
  • Certainty. Because we are taking our measurements inside the operating system, we run the risk that the malware might deliberately feed us incorrect or diversionary results.
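To illustrate the noise problem, here is a small Python sketch that collapses a long run of WriteFile() records into one summary per file. The trace record layout, a `(call_name, path, byte_count)` tuple, and the function name `summarise_writes` are assumptions made for this example, not the output format of any particular tracing tool.

```python
from collections import defaultdict


def summarise_writes(trace):
    """Collapse per-call WriteFile records into one summary per file.

    `trace` is an iterable of (call_name, path, byte_count) tuples;
    this layout is hypothetical and chosen only for illustration.
    """
    totals = defaultdict(lambda: {"calls": 0, "bytes": 0})
    for call_name, path, byte_count in trace:
        if call_name != "WriteFile":
            continue  # other system calls are ignored in this sketch
        totals[path]["calls"] += 1
        totals[path]["bytes"] += byte_count
    return dict(totals)
```

A trace containing thousands of one-byte writes to the same path then shrinks to a single entry that records the number of calls and the total bytes written.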

Most importantly, how do we tell if malware does really sneaky things, such as installing a rootkit, writing to unused parts of the disk via system calls that we aren’t monitoring, or using undocumented features or exploits?

Using virtualisation

Virtualisation can help here.

Unless we are dealing with malware that deliberately behaves differently inside a virtualised environment (e.g. VMware, Xen, VirtualBox), we rarely run samples on “bare metal”, that is, directly on a real computer.

Virtual machines, which are effectively software computers, have many advantages, notably:

  • One physical computer can contain many different starting images for trying out malware.
  • Multiple malware samples can be analysed simultaneously by running multiple virtual machines.
  • Virtual disk images are stored as regular files and can easily be backed up and restored.

The last item turns out to be especially useful in looking out for changes, because we can compare the state of a disk image before and after the malware is run.

The comparison happens from the host computer itself, when the virtual machine is frozen or stopped, so it can’t be tricked by the malware hiding itself by feeding us bogus results. (This behaviour is jocularly known as stealth or anti-anti-virus.)

Effectively, we end up with a sector-level snapshot of everything that changed in the virtual disk image, including changes that might not show up in a conventional snapshot.
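A sector-by-sector comparison of this sort might look like the following Python sketch. It assumes the two images are flat, raw disk files with 512-byte sectors; real virtual disk formats add headers and their own internal layout, which this example does not handle. The name `changed_sectors` is invented for this illustration.

```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


def changed_sectors(before_path, after_path, sector_size=SECTOR_SIZE):
    """Return the indexes of sectors that differ between two raw disk images."""
    changed = []
    with open(before_path, "rb") as before, open(after_path, "rb") as after:
        index = 0
        while True:
            old = before.read(sector_size)
            new = after.read(sector_size)
            if not old and not new:
                break  # both images are exhausted
            if old != new:
                # Also catches sectors present in only one of the images.
                changed.append(index)
            index += 1
    return changed
```

Because the comparison runs on the host against files at rest, nothing running inside the guest has a chance to interfere with the result.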

That includes data written to temporary files, the swap file, the disk’s boot and partition sectors and even to officially-unused parts of the disk.

That’s a trick that some rootkits use to great effect: they implement a proprietary filing system, hidden in empty sectors on the disk, in which they can store programs, data, and configuration files that are as good as invisible to the operating system.

So a sector-level record of what changed on the disk, and where, is a good way of counter-attacking the malware, because changes outside the remit of the operating system show up clearly, and can immediately be flagged as suspicious.

Speeding things up

But one problem with sector-level snapshotting is that looking for changes between the “before” and the “after” images can be time-consuming.

A virtual disk image of a basic Windows 8.1 install, for example, weighs in at 8GB or more, so checking every sector in the “after” file against every sector in the “before” file means reading at least 2 × 8GB of raw data, even if only a handful of sectors have changed.

However, there is a handy shortcut that we can use.

Most virtualisation systems include a snapshotting feature of their own, also known as disk differencing, to make it easy to undo any changes after running a virtual machine for a while.

This is very handy when you are testing new software, or analysing malware.

Instead of writing changes back to the master disk image, a separate “difference image” is used to store changes.

When reading back in from the disk, the virtualisation software checks to see if the needed sector is in the difference image first, only reading from the master image if it is not.

In other words, if we run our malware inside a virtual machine that is in differencing mode, we automatically end up with a list of what changed, and where; and when we examine the differences, we can’t be tricked by any self-protection or stealth features built into the malware.
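The read and write logic of a differencing disk can be modelled in memory with a short Python sketch. This is a conceptual illustration only: the class name `DifferencingDisk` is invented, and real virtualisation products store their difference images in their own on-disk formats rather than in a dictionary.

```python
class DifferencingDisk:
    """Toy model of a master image plus a separate difference image."""

    def __init__(self, master_sectors):
        self._master = master_sectors  # list of bytes objects, never modified
        self._diff = {}                # sector index -> new contents

    def write(self, index, data):
        # Writes never touch the master image; they land in the difference image.
        self._diff[index] = data

    def read(self, index):
        # Check the difference image first, then fall back to the master.
        if index in self._diff:
            return self._diff[index]
        return self._master[index]

    def changed_sectors(self):
        # The difference image doubles as a list of what changed, and where.
        return sorted(self._diff)
```

Discarding `_diff` undoes every change, which is why the same mechanism serves both for rolling back a test machine and for listing what a sample modified.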

Of course, the difference image itself only tells us which sectors have changed, so we still have to work out for ourselves which files those sectors belong to. But changes that don’t belong to any file (for example, because they are part of a rootkit that works outside the operating system) stand out at once.

What the changes tell us

The most obvious benefit of tracking malware-related disk changes with difference images is the ease and reliability of spotting what we might call unauthorised disk modifications, such as those made by low-level rootkits.

But we can use difference images to track file-level changes, too.

By working backwards from the difference image, through the NTFS file allocation table (called the MFT, or Master File Table), we can quickly work out which file “owns” each modified chunk of the disk.

That gives us a rapid list of what changed without processing the entire virtualised master disk image.
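Once the file extents are known, mapping changed sectors back to their owners is a simple lookup, as the Python sketch below suggests. It assumes the extents have already been extracted from the MFT and converted to sector ranges (NTFS itself records them in clusters); parsing the MFT is outside the scope of this example, and the name `owners_of_changes` and the `extents` layout are hypothetical.

```python
def owners_of_changes(changed, extents):
    """Attribute changed sectors to files.

    `changed` is a list of sector indexes; `extents` maps a file name to a
    list of (first_sector, sector_count) runs. Both layouts are assumptions
    made for this sketch.
    """
    owned = {}    # file name -> changed sectors inside that file
    unowned = []  # changed sectors that no file claims
    for sector in changed:
        for name, runs in extents.items():
            if any(start <= sector < start + count for start, count in runs):
                owned.setdefault(name, []).append(sector)
                break
        else:
            # No file owns this sector, so it deserves a closer look.
            unowned.append(sector)
    return owned, unowned
```

Sectors that end up in the `unowned` list are the ones modified outside any file.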

If any of the changed objects look to be of interest, we can then extract them directly from the virtual disk image files for further analysis.

This may sound like a lot of work compared to simply mounting the virtual disk images and scanning through their directory listings, as we would in a regular “before” and “after” snapshot system.

Indeed, at worst, on a computer where every file changed while the malware was running, the difference image might end up as big as the master image.

But, in practice, if we time our snapshots carefully to minimise the amount of change while the malware is running (remember that the difference image records all changes, including uninteresting and unimportant ones), analysing the changes this way can be significantly faster than traditional techniques.

In SophosLabs, the speed improvement we have measured is around 60-fold, so that what used to take a minute now takes a second.

So this is a nice example of how we can work smarter and faster at the same time!

Find and kill rootkits with the free Sophos Virus Removal Tool

This is a simple and straightforward tool for Windows users. It works alongside your existing anti-virus to find and get rid of any threats lurking on your computer, including rootkits and other stealthy malware.

It does its job without requiring you to uninstall your incumbent product first. (Removing your main anti-virus just when you are concerned about infection is risky in its own right.)

Download and run it, wait for it to grab the very latest updates from Sophos, and then let it scan through memory and your hard disk. If it finds any threats, you can click a button to clean them up.
