Yuletide PDF gymnastics

Whilst browsing some reports yesterday, I noticed an unexpected detection at the top of the charts. Over the past few days, Troj/PDFJs-ER has been neck and neck with Mal/Iframe-F as the most prevalent malware being detected on web sites.

A quick peek at the URLs for the PDFs reveals a whole host of new domains, just registered in the past few days. Curious, I grabbed a few samples and set about digging further into the attack…

The first thing to notice is the cunning manner in which the attackers have hidden the JavaScript within the PDF itself. We are used to the usual obfuscation tricks being used to obscure the nature of the script content, but in this case, the bulk of the malicious script is actually carried as a string, within the subject of a page annotation object!

To my mind this is very much akin to the anti-emulation tricks we have seen used in malicious web pages, where the guts of the JavaScript content are stored (obfuscated) as strings within HTML elements in the page. A short stub script is then responsible for extracting and deobfuscating the data using getElementById(). Leafing through the Adobe JavaScript API, I suspect there are several other PDF objects in which script content could be hidden in much the same way.

Back to this PDF example: a short embedded script sets about retrieving and deobfuscating this string.
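
To make the pattern concrete, here is a minimal sketch that runs in Node.js, outside of any PDF reader. The doc object is a hand-written stand-in for the Acrobat document object, mimicking only getAnnots() and the annotation subject property. The encoding (character codes joined by "z") and the helper names encodeForSubject, decodeSubject and recoverHiddenScript are assumptions made for illustration, not the scheme used in the samples. The recovered string is only returned, never evaluated.

```javascript
// Hypothetical encoding: each character becomes its decimal char code,
// joined with "z". This is NOT the scheme used in the real samples.
function encodeForSubject(script) {
  return Array.from(script, (ch) => ch.charCodeAt(0)).join("z");
}

// Reverse of encodeForSubject().
function decodeSubject(subject) {
  return subject
    .split("z")
    .map((code) => String.fromCharCode(Number(code)))
    .join("");
}

// Stand-in for the Acrobat document object. Only the two pieces the stub
// relies on are mimicked: getAnnots({ nPage }) and annotation.subject.
const doc = {
  annots: [
    { page: 0, subject: encodeForSubject('app.alert("harmless demo");') },
  ],
  getAnnots(params) {
    const hits = this.annots.filter((a) => a.page === params.nPage);
    return hits.length > 0 ? hits : null;
  },
};

// The stub: fetch the annotations on the first page, read the subject of
// the first one and decode it. A malicious stub would go on to evaluate
// the result; here it is simply returned for inspection.
function recoverHiddenScript(pdfDoc) {
  const annots = pdfDoc.getAnnots({ nPage: 0 });
  if (annots === null) {
    return "";
  }
  return decodeSubject(annots[0].subject);
}

const recovered = recoverHiddenScript(doc);
console.log(recovered);
```

The point of the mock is that a scanner looking only at the embedded JavaScript stream sees nothing but the short stub; the interesting content lives in a different PDF object altogether.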

Manually extracting and deobfuscating the string is fairly straightforward. Doing so revealed a script containing yet another layer of obfuscation. This layer is mildly polymorphic across samples (the variable and function names change) but consists of a decryption function, whose loop uses the infamous arguments.callee, followed by a call to that function passing in the encrypted string.
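
For anyone who has not met it, arguments.callee refers to the function that is currently running, so converting it to a string yields that function's own source. I am assuming it plays its usual role here: feeding the decoder's source text into the key, so that any edit an analyst makes to the decoder garbles the output. The sketch below illustrates that idea only. The XOR cipher, the key derivation and every name in it are assumptions, not code from the samples, and it calls toString() on a named function rather than using arguments.callee, which throws in strict-mode code.

```javascript
// Assumed scheme: a one-byte key is the sum of the character codes of a
// function's source text (whitespace ignored), modulo 256.
function keyFromSource(source) {
  let key = 0;
  for (const ch of source.replace(/\s/g, "")) {
    key = (key + ch.charCodeAt(0)) % 256;
  }
  return key;
}

// XOR every character with the key; applying it twice restores the input.
function xorWithKey(text, key) {
  return Array.from(text, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) ^ key)
  ).join("");
}

function decryptWithSource(encrypted, source) {
  return xorWithKey(encrypted, keyFromSource(source));
}

// The decoder keys itself on its own source text. The samples would reach
// that text through arguments.callee (assumed) instead of the name.
function decoder(encrypted) {
  return decryptWithSource(encrypted, decoder.toString());
}

const plain = 'app.alert("harmless demo");';
const encrypted = xorWithKey(plain, keyFromSource(decoder.toString()));

// Untouched decoder: the key matches and the plaintext comes back.
console.log(decoder(encrypted) === plain); // true

// Simulate an analyst editing the decoder (swapping "return" for "print"):
// the source text changes, so the derived key changes too.
const editedSource = decoder.toString().replace("return", "print");
console.log(decryptWithSource(encrypted, editedSource) === plain); // false
```

In practice the way around this is to leave the decoder untouched and intercept its output instead, or to compute the key from a pristine copy of the source.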

Removing this layer of obfuscation got me to my goal – identification of the payload. Oddly, it seems like an awful lot of effort to go to in order to hide code that targets a pretty old Adobe vulnerability (CVE-2007-5659). It should be noted, though, that aside from the PDFs, these attacks also involve other components targeting other vulnerabilities, such as one in DirectShow (CVE-2008-0015).

Anyway, if the exploit is successful, the payload malware (Troj/Agent-LWP) is downloaded from the same domain and the user is infected.

This attack typifies the approach of today’s malware authors: aggressive, cunning and well coordinated. Increasingly, we are seeing the tricks that have become commonplace for scripts in web pages being applied to scripts within PDFs. Now it appears getElementById() has a counterpart in getAnnots(). To protect against this and future attacks, users should ensure they have quality content scanning and URL filtering in place, patch their OS, browser and applications, and browse safely.