Hash Busting Blogs

You are likely to be familiar with the concept of ‘hash busters’ within spam. Hash busters are the seemingly random words or sentences located at the bottom of a spam message, used to try to bypass a variety of anti-spam techniques. For those who are interested, here’s some background.

Bayesian filtering was hailed as the ultimate anti-spam technique a few years ago, as it used machine learning techniques to adapt itself to different mail messages. As you indicated whether each message was good or bad, the Bayesian filter quickly learned the type of email you wanted to receive. Bayesian filters worked reasonably well on the desktop, where an individual user was doing the training. At the gateway, they were less successful, thanks to the huge variation in email messages that users wanted to receive.
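To make the training idea concrete, here is a minimal sketch of a word-based naive Bayes filter. It is an illustration only, not how any particular product works: the class name ToyBayesFilter, the whitespace tokenizer and the add-one smoothing are all assumptions chosen for brevity.

```python
import math
from collections import Counter


class ToyBayesFilter:
    """Hypothetical toy filter: learns word frequencies from labelled mail."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label is "spam" or "ham" (ham = mail the user wants to receive)
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        # Assumes at least one spam and one ham message have been trained.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.message_counts[label] / total_messages)
            for word in text.lower().split():
                # Add-one smoothing so unseen words do not zero the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = score
        # Turn the two log scores into a probability of spam.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


mail_filter = ToyBayesFilter()
mail_filter.train("cheap pills buy now", "spam")
mail_filter.train("buy cheap watches now", "spam")
mail_filter.train("meeting agenda for monday", "ham")
mail_filter.train("lunch on monday", "ham")
```

With a single user doing the training, the two word lists stay distinct. At a gateway, one person's wanted newsletter is another person's spam, so the same words end up counted in both lists and the scores become less decisive.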

Another anti-spam technique is to use a large network of machines, either at desktops or at email gateways, to compute a ‘fingerprint’ (sometimes called a hash) of each message and check whether the same fingerprint appears in large amounts of email going to different people at the same time. The theory is that if you see exactly the same message in lots of places at the same time, it must be spam.
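The fingerprinting idea can be sketched in a few lines. This is a simplified illustration under stated assumptions: the names fingerprint and FingerprintTracker are hypothetical, SHA-256 stands in for whatever hash a real system uses, the threshold of three recipients is arbitrary, and the time window ("at the same time") is left out entirely.

```python
import hashlib
from collections import defaultdict


def fingerprint(body):
    # Normalize case and whitespace, then hash the whole message body.
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class FingerprintTracker:
    """Hypothetical tracker: flags a body seen by many distinct recipients."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.recipients_by_hash = defaultdict(set)

    def observe(self, recipient, body):
        # Record who received this fingerprint; return True once it is "bulk".
        digest = fingerprint(body)
        self.recipients_by_hash[digest].add(recipient)
        return len(self.recipients_by_hash[digest]) >= self.threshold


tracker = FingerprintTracker(threshold=3)
results = [
    tracker.observe(recipient, "Buy cheap watches now")
    for recipient in ("alice@example.com", "bob@example.com", "carol@example.com")
]
```

Here the third sighting of the identical body crosses the threshold, which is the signal the technique relies on.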

Spammers use hash busters to minimize the effectiveness of these techniques by inserting random words or sentences at the end of the email. This text differentiates one spam from another, complicating the process of fingerprinting a spam message: it is more difficult for automated analysis to spot a spam message when the messages are no longer identical. The hash buster is sometimes visible to the recipient, while in an HTML email, it can be tagged as white text on a white background so it is essentially invisible to the user.
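A short sketch shows why this works against exact fingerprints. Everything here is illustrative: the word list, the add_hash_buster helper and the use of SHA-256 are assumptions, and real spam tools are not known to work this way in detail.

```python
import hashlib
import random

# Hypothetical filler vocabulary; real hash busters borrow text from anywhere.
FILLER_WORDS = ["database", "river", "administrator", "window",
                "chapter", "quietly", "index", "harbour"]


def exact_fingerprint(body):
    # Hash of the full message body, as an exact-match fingerprint would use.
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def add_hash_buster(body, rng, n_words=6):
    # Append a run of randomly chosen words below the real message.
    filler = " ".join(rng.choice(FILLER_WORDS) for _ in range(n_words))
    return body + "\n\n" + filler


rng = random.Random(0)  # fixed seed so the sketch is repeatable
base_message = "Buy cheap watches now"
variants = [add_hash_buster(base_message, rng) for _ in range(5)]
variant_digests = {exact_fingerprint(v) for v in variants}
```

Every variant still carries the same sales pitch, yet none of them hashes to the fingerprint of the original message, so a tracker that counts identical fingerprints never sees them as one campaign.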

The networks of compromised machines (known as bots) that are used to send out the spam vary the hash buster text very regularly. Analysts at SophosLabs see hash buster text from a wide variety of sources: it can be a selection of random words or extracts from news articles, books and so on. In fact, an analyst working on a weekend was able to tell me the plot of a book based on what he had seen in a spam campaign that erupted during his shift.

Typical spam message, with hash buster text

Today, I noticed an interesting hash buster discussing the role of database administrators (DBAs) and I decided to investigate. A quick search gave me the answer: each paragraph was an extract from a blog by a database management specialist.

I suspect the spammers are using RSS feeds from blogs to source paragraphs as hash busters. So if part of this blog entry is located at the bottom of an unsolicited email message, visit www.sophos.com and learn how to stop receiving spam.