How fast fingerprinting of OLE2 files can lead to efficient malware detection


At last week’s Virus Bulletin 2011 conference, Paul Baccas and Stephen Edwards from SophosLabs presented their research paper “Fast fingerprinting of OLE2 files: Heuristics for detection of exploited OLE2 files based on specification non-conformance”.

They may win the prize for the longest title, but what does it mean? OLE2 is a container format synonymous with Microsoft Office files, although it is used for many other purposes.

Baccas and Edwards analysed both clean and malicious OLE2 files to determine whether conformance to the official specifications for the OLE2 format could be used as a heuristic to distinguish benign files from malicious ones.
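To give a flavour of what a conformance check can look like, here is a minimal Python sketch. It is not the code from the paper: the function name `header_anomalies` and the choice of checks are our own illustration, based on the fixed header values documented in Microsoft's Compound File Binary (MS-CFB) specification. A non-empty result only means the header is unusual, not that the file is malicious.

```python
import struct

# The 8-byte signature that opens every OLE2 (compound file) header.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def header_anomalies(data: bytes) -> list:
    """Illustrative helper (hypothetical name): list the ways the first
    512 bytes deviate from the fixed header values in MS-CFB."""
    if len(data) < 512:
        return ["shorter than one 512-byte header"]
    if data[:8] != OLE2_MAGIC:
        return ["missing OLE2 signature"]

    anomalies = []

    # Bytes 8-23: header CLSID, which the specification requires to be zero.
    if data[8:24] != bytes(16):
        anomalies.append("header CLSID is not all zeroes")

    # Bytes 24-33: five little-endian 16-bit fields.
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", data, 24
    )
    if byte_order != 0xFFFE:
        anomalies.append("byte-order mark is not 0xFFFE")

    # Major version 3 implies 512-byte sectors, version 4 implies 4096-byte.
    expected_shift = {3: 9, 4: 12}.get(major)
    if expected_shift is None:
        anomalies.append("unexpected major version %d" % major)
    elif sector_shift != expected_shift:
        anomalies.append("sector shift does not match major version")

    if mini_shift != 6:
        anomalies.append("mini sector shift is not 6")

    # Bytes 34-39: reserved, must be zero.
    if data[34:40] != bytes(6):
        anomalies.append("reserved bytes are not zero")

    # Bytes 56-59: mini stream cutoff size, fixed at 4096.
    (mini_cutoff,) = struct.unpack_from("<I", data, 56)
    if mini_cutoff != 0x1000:
        anomalies.append("mini stream cutoff is not 4096")

    return anomalies
```

Because every check reads only the first 512 bytes, a scanner could call this on `open(path, "rb").read(512)` and use the length of the returned list as one cheap signal when deciding which samples to look at more closely.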

They pointed out many poorly defined parts of the specification and injected a bit of humour along the way, a welcome change at a serious conference like VB.

Their conclusion? Using heuristics to classify OLE2 files that are more likely to be malicious based on non-conformance is a useful tool for grouping samples to decide which ones deserve deeper inspection.

Microsoft Excel files in particular showed promise: over 96% of files provide the required information within the first 8 KB, which is often less than anti-virus engines already parse to determine whether macros are present.

In addition to their results, their paper includes source code demonstrating the techniques used to conduct the research, which should prove helpful to others tasked with efficiently detecting malicious OLE2 files.

Thank you to Virus Bulletin for granting us permission to share the paper and slides.