Security experts are playing a game of cat-and-mouse with malware authors, who are continually looking for ways to bypass detection by anti-malware products.
As regular readers of Naked Security will know, one commonly-seen method of distributing malware is to embed an attack inside a malformed PDF. And, one way to hide code inside a malicious PDF is to use filters.
Filters are used by PDFs to compress or store data to either make the file smaller (Flate, CCITTFax) or allow it to be read as text (ASCIIHex).
By combining the filters in weird ways the malware author hopes to bypass detection by malware scanners and deliver a malicious payload to the victim.
Last April, we saw some PDF malware using /DecodeParms filter parameters to obfuscate malicious code.
When I saw it I knew we would see more PDF malware using image filters to obfuscate malicious payloads.
Sadly, that prediction appears to have come true.
Late last month, while analysing samples received via the Wepawet project, I saw the first use of the CCITTFax filter to hide malicious content (detected as Troj/PDFJs-WT by Sophos products).
As you can see below, the stream embedded in this sample is encoded. You can see the use of /Filter, which indicates that the data in that stream is encoded by one or more filters (shown in square brackets).
In the case of this sample, the filters in use are ASCIIHex, CCITTFax, ASCIIHex (again) and Flate:
/Filter [/ASCIIHexDecode /CCITTFaxDecode /ASCIIHexDecode /FlateDecode] /DecodeParms [ null << /Columns 28176 /Rows 1 >> ]
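As a rough illustration of how a scanner might read that line, the sketch below pulls the filter names out of the /Filter array and pairs them by position with the /DecodeParms entries. It assumes the positional pairing described in the PDF specification (one entry per filter, with null standing for "no parameters"). The function names and the regex-based parsing are hypothetical simplifications, not a real PDF parser.

```python
import re
from itertools import zip_longest

# Stream dictionary fragment copied from the sample discussed above.
STREAM_DICT = (
    "/Filter [/ASCIIHexDecode /CCITTFaxDecode /ASCIIHexDecode /FlateDecode] "
    "/DecodeParms [ null << /Columns 28176 /Rows 1 >> ]"
)

def list_filters(stream_dict):
    """Return the filter names inside the /Filter [...] array, in order."""
    match = re.search(r"/Filter\s*\[(.*?)\]", stream_dict, re.S)
    return re.findall(r"/(\w+)", match.group(1)) if match else []

def list_decode_parms(stream_dict):
    """Return the raw /DecodeParms entries ('null' or '<< ... >>'), in order."""
    match = re.search(r"/DecodeParms\s*\[(.*)\]", stream_dict, re.S)
    return re.findall(r"null|<<.*?>>", match.group(1), re.S) if match else []

def pair_filters_with_parms(stream_dict):
    """Pair each filter with the parameter entry at the same position."""
    filters = list_filters(stream_dict)
    parms = list_decode_parms(stream_dict)[:len(filters)]
    # Filters without a matching entry are marked explicitly.
    return list(zip_longest(filters, parms, fillvalue="(none given)"))

for name, parms in pair_filters_with_parms(STREAM_DICT):
    print(f"{name:16} {parms}")
```

Note that this sample supplies only two parameter entries for four filters, so the last two filters fall back to their defaults in this sketch.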
According to Adobe documentation (PDF 32000-1:2008):
“The ASCIIHexDecode filter decodes data that has been encoded in ASCII hexadecimal form.”
“The CCITTFaxDecode filter decodes image data that has been encoded using either Group 3 or Group 4 CCITT facsimile (fax) encoding.”
“The Flate method is based on the public-domain zlib/deflate compression method, …”
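To make the idea of chained filters concrete, here is a minimal sketch that applies a list of filters in order, using only Python's standard library. It covers ASCIIHexDecode and FlateDecode. CCITTFaxDecode is left out because the standard library has no decoder for it. The names `DECODERS` and `apply_filters` are hypothetical, and the demo payload is made up for illustration.

```python
import binascii
import zlib

def ascii_hex_decode(data):
    """Decode ASCII hex text, skipping whitespace and stopping at '>' (EOD)."""
    hex_digits = data.split(b">", 1)[0]
    hex_digits = b"".join(hex_digits.split())   # drop all ASCII whitespace
    if len(hex_digits) % 2:                     # odd digit count: assume a trailing 0
        hex_digits += b"0"
    return binascii.unhexlify(hex_digits)

def flate_decode(data):
    """Inflate zlib/deflate data."""
    return zlib.decompress(data)

# Hypothetical lookup table from PDF filter name to decoder function.
DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "FlateDecode": flate_decode,
}

def apply_filters(data, filter_names):
    """Run the data through each named filter, first to last."""
    for name in filter_names:
        data = DECODERS[name](data)
    return data

# Demo: encode in reverse order, then decode with the filter chain.
payload = b"hello from a PDF stream"
encoded = binascii.hexlify(zlib.compress(payload)) + b">"
decoded = apply_filters(encoded, ["ASCIIHexDecode", "FlateDecode"])
print(decoded)
```

The order matters: the first filter in the list is the outermost layer of encoding, so it is the first to be undone.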
Of the three filters used in this sample, only CCITT has parameters that allow it to be controlled. In this case:
- Null means that this is a Group 3 1-D encoding and that the image is on one row with 28176 elements
- CCITT Group 3 1-D encoding is a variation on a Huffman encoding scheme where the image is split into 1-bit white and black pixel runs (white run length and black run length written below as wrl and brl respectively)
- each run length has a tailored Huffman encoding
To illustrate how to decode the encoded stream, I am going to use just the start of the encoded stream:
This would be converted to hex values and then decoded by the CCITT decoder. As CCITT is a bit-based encoding stream (rather than byte-based), we must convert the above string to binary:
0000 0000 0001 0011 0101 1101 1101 0100 0111 0000 0110 0011 1000 1110 0011 0001 1101 0011 1100 1111
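The conversion from hex digits to a bit string can be sketched in a few lines of Python. The hex text used here was recovered by regrouping the binary digits shown above, so its exact formatting in the sample (case, spacing) is an assumption. The function name `hex_to_bit_groups` is hypothetical.

```python
# Hex digits regrouped from the binary listing above (assumed formatting).
SAMPLE_HEX = "00135dd470638e31d3cf"

def hex_to_bit_groups(hex_text):
    """Convert hex text to a string of bits, grouped in fours for readability."""
    raw = bytes.fromhex(hex_text)
    bits = "".join(format(byte, "08b") for byte in raw)
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))

print(hex_to_bit_groups(SAMPLE_HEX))
```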
We can break this down into specific segments:
0000 0000 0001: the EOL marker, which should start the encoded data.
0011 0101: the code for a wrl of 0.
11: the code for a brl of 2.
0111: the code for a wrl of 2.
010: the code for a brl of 1.
1000: the code for a wrl of 3.
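The breakdown above can be reproduced with a small table-driven decoder. This is only a sketch: the tables hold just the Group 3 terminating codes for run lengths 0 to 7, which is enough for this sample fragment but not for real images (longer runs and make-up codes are omitted). The names `WHITE_CODES`, `BLACK_CODES` and `decode_runs` are hypothetical.

```python
# Terminating codes for run lengths 0-7 only (a deliberately partial table).
WHITE_CODES = {"00110101": 0, "000111": 1, "0111": 2, "1000": 3,
               "1011": 4, "1100": 5, "1110": 6, "1111": 7}
BLACK_CODES = {"0000110111": 0, "010": 1, "11": 2, "10": 3,
               "011": 4, "0011": 5, "0010": 6, "00011": 7}
EOL = "000000000001"

# The bit string shown above, with the spaces removed.
SAMPLE_BITS = ("0000 0000 0001 0011 0101 1101 1101 0100 0111 0000 "
               "0110 0011 1000 1110 0011 0001 1101 0011 1100 1111").replace(" ", "")

def decode_runs(bits):
    """Return (colour, run_length) pairs; each row starts with a white run."""
    if not bits.startswith(EOL):
        raise ValueError("expected an EOL marker at the start of the data")
    pos = len(EOL)
    colour = "w"
    runs = []
    while pos < len(bits):
        table = WHITE_CODES if colour == "w" else BLACK_CODES
        # Try every code length used by the tables (2 to 10 bits).
        for length in range(2, 11):
            code = bits[pos:pos + length]
            if len(code) == length and code in table:
                runs.append((colour, table[code]))
                pos += length
                break
        else:
            break  # truncated data, or a code outside the partial table
        colour = "b" if colour == "w" else "w"
    return runs

for colour, length in decode_runs(SAMPLE_BITS)[:5]:
    print(colour, length)
```

Because the codes are prefix-free, trying the shortest candidate first is safe: no valid code is the beginning of another.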
So, taking just the runs listed above, we can write the first byte as:
bbwwbwww
We can write this in two ways, depending upon whether we consider b as 1 or 0:
- b as 1 : 11001000
- b as 0 : 00110111
Needless to say I chose the wrong one when I first implemented it. The correct version is the second 🙂
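The two readings can be checked mechanically. In the sketch below, the first five runs are the ones listed above. The remaining runs are an assumption: they come from continuing the same decoding over the rest of the sample bits with the standard Group 3 terminating codes. The function name `runs_to_bytes` is hypothetical.

```python
# (colour, run length) pairs for the sample fragment.
# First five are from the breakdown above; the rest are derived (assumption).
RUNS = [("w", 0), ("b", 2), ("w", 2), ("b", 1), ("w", 3),
        ("b", 2), ("w", 3), ("b", 5), ("w", 1), ("b", 7),
        ("w", 3), ("b", 2), ("w", 1), ("b", 1), ("w", 2),
        ("b", 3), ("w", 2)]

def runs_to_bytes(runs, black_bit):
    """Expand runs into bits, using black_bit ('0' or '1') for black pixels."""
    white_bit = "1" if black_bit == "0" else "0"
    bits = "".join((black_bit if colour == "b" else white_bit) * length
                   for colour, length in runs)
    usable = len(bits) - len(bits) % 8          # ignore a trailing partial byte
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

print("b as 1:", runs_to_bytes(RUNS, "1").hex(" "))
print("b as 0:", runs_to_bytes(RUNS, "0"))
```

Only the "b as 0" reading produces printable ASCII hex digits, which is what the next filter in the chain expects. This is also consistent with the /BlackIs1 parameter of CCITTFaxDecode defaulting to false in the PDF specification.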
So, the CCITT stream decodes to:
78 9c ed d3 f9 ...
Which, when run through an ASCII hex decoder (which ignores spaces), produces:
Those familiar with PDF files will recognise that this looks like the start of a Flate encoded stream.
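One way to recognise this programmatically is to test the first two bytes against the zlib header rules: the low nibble of the first byte is 8 (deflate), the window size nibble is at most 7, and the two bytes taken as a 16-bit number are a multiple of 31. The sketch below applies that check to the bytes shown above. The function name `looks_like_zlib` is hypothetical, and passing the check only suggests a zlib stream; it does not prove the data will inflate.

```python
import zlib

def looks_like_zlib(data):
    """Heuristic check of the two-byte zlib header (CMF and FLG)."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    is_deflate = (cmf & 0x0F) == 8          # compression method 8 = deflate
    window_ok = (cmf >> 4) <= 7             # window size up to 32K
    check_ok = (cmf * 256 + flg) % 31 == 0  # header checksum
    return is_deflate and window_ok and check_ok

# bytes.fromhex() skips the spaces, much like the ASCII hex filter does.
header = bytes.fromhex("78 9c ed d3 f9")
print(looks_like_zlib(header))
print(looks_like_zlib(zlib.compress(b"any data")))
```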
When you Flate-decode this, you get a font that is vulnerable to the CVE-2010-2883 vulnerability, patched in Adobe Security Bulletin APSB10-21.
As ever, SophosLabs recommends that you make sure you are on the latest version of Adobe software.
6 comments on “PDF malware adopts another obfuscation trick in attempt to avoid detection”
Is having the latest edition of Adobe software enough to protect against this? Not being a techno person, there is no way in which I could detect an infected file, although I was fascinated to read your article – but didn't really understand it!
From the article:
"When you Flate-decode this, you get a font that is vulnerable to the CVE-2010-2883 vulnerability, patched in Adobe Security Bulletin APSB10-21."
So, yes — Adobe patched this one.
Does a patch update in Adobe Reader increment the version number, or would my version number remain the same after the patch update is applied?
Just wonder if PDF malware can affect other PDF readers, like Nitro, Foxit, PDF X-Change, etc. Some time ago Acrobat Reader lost its "monopoly" on PDF, and there are dozens of PDF editors/readers on the market.
Very interesting article, I think using new filters was a matter of time, and here it is…
I have a comment about this phrase: "Null means that this is a Group 3 1-D encoding and that the image is on one row with 28176 elements". I think there is a small error here, because the Null object means that the /ASCIIHexDecode filter has no parameters, the absence of /K element in the /CCITTFaxDecode parameters dictionary means that it's a Group 3 1-D encoding, because the default value of K is 0 (Group 3 1-D).
I just wanted to point this out, but the article is really good, very good explanation about decoding this type of filter 😉