Later this year, I will be presenting a paper in Barcelona with my colleague Stephen Edwards at the Virus Bulletin conference. It’s a great meeting of minds, where businesses and anti-virus experts get together to discuss the latest threats and the technologies to counter them.
The talk Stephen and I will be presenting is called “Fast fingerprinting of OLE2 files: heuristics for detection of exploited OLE2 files based on specification non-conformance”. With a title like that you won’t be surprised to hear that our paper is being given on the more technical side of the conference. 🙂
OLE2 files have a long history when it comes to being vectors for malware infection.
In the mid-1990s it was macro viruses that caused problems (remember the Concept virus et al.?).
However, in recent years a growing challenge has been that of targeted attacks using exploited OLE2 files.
One of the problems when scanning OLE2 files is that they are in effect a filesystem in and of themselves (you can embed Word files into Excel files and EXE files within the original Word document ad infinitum).
Users want to access their files quickly, so in any malware-scanning tool there is a trade-off between speed and thoroughness. When scanning complex file formats (which OLE2 files are), shortcuts to malware detection are much sought after.
While analysing some exploited OLE2 files recently, we saw that they did not conform to the specifications, so we wrote some tools to check conformance.
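To give a flavour of what a conformance check can look like, here is a minimal sketch in Python. It is not one of our tools: it only tests a handful of fixed fields in the 512-byte OLE2 (compound file) header against the values the published compound file specification requires, and the function and violation names are invented for this illustration.

```python
import struct

# Fixed 8-byte signature that opens every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def check_header(header: bytes) -> list:
    """Return the names of failed header checks (empty list = all passed).

    The violation names are hypothetical labels, not taken from any standard.
    """
    if len(header) < 512:
        return ["truncated_header"]
    violations = []
    if header[:8] != OLE2_MAGIC:
        violations.append("bad_signature")
    if header[8:24] != bytes(16):          # header CLSID must be all zeroes
        violations.append("nonzero_clsid")
    # Five little-endian 16-bit fields starting at offset 24.
    _minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", header, 24
    )
    if major not in (3, 4):
        violations.append("bad_major_version")
    if byte_order != 0xFFFE:
        violations.append("bad_byte_order")
    expected_shift = {3: 9, 4: 12}.get(major)  # 512- or 4096-byte sectors
    if expected_shift is not None and sector_shift != expected_shift:
        violations.append("bad_sector_shift")
    if mini_shift != 6:                    # mini sectors are always 64 bytes
        violations.append("bad_mini_sector_shift")
    if header[34:40] != bytes(6):          # reserved bytes must be zero
        violations.append("nonzero_reserved")
    (cutoff,) = struct.unpack_from("<I", header, 56)
    if cutoff != 0x1000:                   # mini stream cutoff size
        violations.append("bad_mini_stream_cutoff")
    return violations


def build_minimal_header() -> bytearray:
    """Build a synthetic version 3 header that passes every check above."""
    header = bytearray(512)
    header[:8] = OLE2_MAGIC
    struct.pack_into("<5H", header, 24, 0x003E, 3, 0xFFFE, 9, 6)
    struct.pack_into("<I", header, 56, 0x1000)
    return header
```

A real checker would go well beyond the header (sector chains, directory entries, stream sizes and so on), but the idea is the same: every rule in the specification becomes a test that either passes or yields a named violation.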
Unfortunately, we found that clean files were also failing to conform to the OLE2 standard, but in subtly different ways. So we attempted to group the files (clean and malicious) by where they violated the specifications.
Initially, we did this with approximately 10,000 Excel files.
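The grouping step can be pictured with a small sketch. This is the simplest possible approach and not necessarily the one in the paper: it assumes each file has already been reduced to a list of violation names (hypothetical labels such as `bad_sector_shift`) and puts files with exactly the same set of violations into the same group.

```python
from collections import defaultdict


def group_by_fingerprint(results):
    """Group file names by their exact set of violations.

    `results` maps a file name to a list of violation names. The return
    value is a list of (fingerprint, file_names) pairs, largest group first.
    """
    groups = defaultdict(list)
    for name, violations in results.items():
        # Sort and de-duplicate so the order of checks does not matter.
        fingerprint = tuple(sorted(set(violations)))
        groups[fingerprint].append(name)
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
```

Files that break the specification in the same places end up with the same fingerprint, so an unusually large group is a natural candidate for closer inspection.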
When we clustered the results of our specification non-conformance tests, we found one particularly prominent group.
Closer examination of this group found that 71% of its files exploit the CVE-2009-3129 vulnerability (which was patched in Microsoft’s MS09-067 security bulletin), and Sophos correctly detects them as Troj/DocDrop-S.
The exploit being used by the Trojan relies upon the FeatHdr BIFF record (see page 300 of Microsoft’s Excel binary file format specification).
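For readers unfamiliar with BIFF, the workbook stream is a sequence of records, each with a 2-byte record type and a 2-byte payload size (both little-endian) followed by the payload. The sketch below walks such a stream and reports where records of a given type occur. It assumes the workbook stream has already been extracted from the OLE2 container, and it takes 0x0867 as the FeatHdr record type, which is worth confirming against the specification; the function names are made up for this example.

```python
import struct

# Record type for FeatHdr (assumed here; verify against the Excel spec).
FEATHDR = 0x0867


def iter_biff_records(stream: bytes):
    """Yield (offset, record_type, payload) for each record in the stream."""
    offset = 0
    while offset + 4 <= len(stream):
        rec_type, size = struct.unpack_from("<HH", stream, offset)
        start = offset + 4
        end = start + size
        if end > len(stream):
            # Declared size runs past the end: truncated or malformed data.
            break
        yield offset, rec_type, stream[start:end]
        offset = end


def find_records(stream: bytes, wanted: int = FEATHDR):
    """Return (offset, payload_length) for every record of type `wanted`."""
    return [
        (offset, len(payload))
        for offset, rec_type, payload in iter_biff_records(stream)
        if rec_type == wanted
    ]
```

This only locates records. Deciding whether a particular record is well formed means comparing its payload with the layout given in the specification.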
Why is this relevant to the challenge of making malware detection quicker? You’ll have to come to the Virus Bulletin conference in Barcelona to find out!
Here is a technical snippet of what we will be presenting on the Friday of the conference.