Bots on LiveJournal explored

LiveJournal is a mixture of social network and blogging platform. It is multilingual, but most popular with Russian-speaking users.

The peculiar security model of this site makes successful bots quite valuable to their creators. A rich API makes it possible to automate all the operations involved.
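To illustrate how little that automation takes, here is a minimal Python sketch of preparing a post for LiveJournal's XML-RPC interface. The endpoint and the challenge-response scheme (an MD5 hex digest of the challenge concatenated with the MD5 hex digest of the password) follow the documented protocol; the helper names and the trimmed-down field set are my own, and a real `LJ.XMLRPC.postevent` call requires date fields as well.

```python
import hashlib

# Documented LiveJournal XML-RPC endpoint.
XMLRPC_ENDPOINT = "http://www.livejournal.com/interface/xmlrpc"

def challenge_response(challenge: str, password: str) -> str:
    """LiveJournal challenge-response auth:
    MD5 hex of (challenge + MD5 hex of the password)."""
    pw_hash = hashlib.md5(password.encode()).hexdigest()
    return hashlib.md5((challenge + pw_hash).encode()).hexdigest()

def build_postevent(username: str, challenge: str, password: str,
                    subject: str, body: str) -> dict:
    """Argument struct for LJ.XMLRPC.postevent (illustrative subset;
    the real call also needs year/mon/day/hour/min fields)."""
    return {
        "username": username,
        "auth_method": "challenge",
        "auth_challenge": challenge,
        "auth_response": challenge_response(challenge, password),
        "subject": subject,
        "event": body,
        "lineendings": "unix",
    }

# Sending would then look like this (not executed here):
#   import xmlrpc.client
#   server = xmlrpc.client.ServerProxy(XMLRPC_ENDPOINT)
#   challenge = server.LJ.XMLRPC.getchallenge()["challenge"]
#   server.LJ.XMLRPC.postevent(build_postevent(...))
```

With fetching, posting and friend management all scriptable this way, running thousands of accounts is a matter of a loop.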

So, the hackers have the means and the motive. But users are getting smarter at spotting bots, so the botmasters are using ever-more advanced techniques.

Let’s have a look at some of these techniques, which may make the jump from LiveJournal to mainstream social networking sites in the near future.

First of all, some background on the security model. LiveJournal posts can be public – visible to everyone, including people without LiveJournal accounts. Or they can be friends-only – visible only to LiveJournal users on the poster’s friends list.

This sounds great in theory: the user can fully control who can see his posts, as friends can be split into separate groups, such as “family” and “co-workers”. Each post can be targeted at a specific group.


The trouble starts when we remember the dual nature of LiveJournal: to add another user’s blog to your aggregate reading feed, you must add that user to your friends list. But by doing so, you make your private information visible to them. Are the people whose blogs we want to read necessarily the same people we want to be able to read our restricted info?

Not always. For example, I’m interested in the latest announcements from HM Revenue and Customs (the UK taxation office), so I would like to add HMRC to my feed. But I don’t want them to be able to access my private information.

So, why is it valuable for a botmaster to trick another user into befriending a bot?

* The botmaster gets access to your private information. This includes information such as date and place of birth, schools attended, and the primary email address used for registering with LiveJournal. All these are useful for password attacks on your LiveJournal account (and possibly also on other, more important, accounts).

* The botmaster can leave comments with hidden links. For example, this sort of link is not visible to the user, but counts as a reference to the botmaster’s web site:

Hidden links on LiveJournal

* The botmaster improves the bot’s reputation for future use.
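The hidden-link trick above can be sketched in a few lines. The markup below is a hypothetical example of one hiding technique (invisibly styled text inside an anchor), paired with a naive detector; real spam varies its tricks, and this regex-based check is illustrative only.

```python
import re

# One way to hide a link: wrap it around zero-size or invisible text.
# (Illustrative markup only; not taken from a real comment.)
HIDDEN_COMMENT = (
    'Great post!'
    '<a href="http://example-spam.invalid/">'
    '<span style="font-size:0px">cheap pills</span></a>'
)

# Styles that make anchor content invisible to the reader.
INVISIBLE_STYLE = re.compile(
    r'style="[^"]*(font-size:\s*0|display:\s*none|visibility:\s*hidden)',
    re.IGNORECASE,
)

def has_hidden_link(html: str) -> bool:
    """Naive check: an <a> element whose content is invisibly styled."""
    for match in re.finditer(r'<a\b[^>]*>(.*?)</a>', html,
                             re.IGNORECASE | re.DOTALL):
        if INVISIBLE_STYLE.search(match.group(0)):
            return True
    return False
```

The reader sees only “Great post!”, but search engines and click-through counters still register the outbound link.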

To maximise exposure, bots monitor postings and try to leave their comments near the top of the list.

But the really lucrative spamming opportunities lie with communities. Communities unite users with common interests. A community looks like any other blog, can be added to the aggregate feeds, and all community members can post to it.

From the bot’s point of view, posting is much better than commenting – a post will be seen by all the community’s subscribers.

Community spam is much easier to target: if you promote your range of magic frying pans in the community gourmet_cooking, and your Viagra in erotic_art, then the results will probably be better than targeting a similar number of randomly-chosen users.

Also, the HTML content of posts is richer than what is allowed in comments, so drive-by infections are possible, too.

Some communities protect themselves from spamming by pre-moderating posts. Pre-moderation of content is an effective solution for fighting bots. However, popular communities are generally high-traffic, so pre-moderating each post is impractical. Pre-moderation of membership is usually used instead.

Pre-moderation on LiveJournal

Before approval, the moderator will look at each new member’s profile, trying to distinguish between legitimate users and bots.

Here’s an example:

LiveJournal profile

This screenshot was taken on 19 October 2010. The profile was created just one day before, has never posted, and never commented. This is obviously suspicious. Let’s keep looking:

LiveJournal profile

The profile has no friends, but has joined 227 communities since yesterday. This is obviously a bot, so we reject and ban it.
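The manual checks just described boil down to a few profile signals, which could be sketched as a scoring rule. The thresholds below are illustrative guesses of mine, not taken from any real moderation tool.

```python
from dataclasses import dataclass

@dataclass
class Profile:
    account_age_days: int
    entries: int
    comments_posted: int
    friends: int
    communities_joined: int

def looks_like_bot(p: Profile) -> bool:
    """Rough mirror of the manual checks above: a brand-new account
    with no activity that is mass-joining communities is suspect.
    Thresholds are illustrative, not from any real tool."""
    brand_new = p.account_age_days <= 2
    silent = p.entries == 0 and p.comments_posted == 0
    mass_joiner = p.communities_joined > 50 and p.friends == 0
    return brand_new and silent and mass_joiner
```

The profile in the screenshots (one day old, no posts, no comments, no friends, 227 communities) trips every one of these checks.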

However, not all bots are this easy to spot. Botmasters can use numerous tricks to create more realistic-looking bots:

* Be patient. Create a bot in advance, for use months later.

* Be creative. Steal content from other blogs, making the bot look legitimate. (The API helpfully allows you to backdate your entries.)

* Be interesting. Add interests to the bot which look relevant to the community subject.

* Be local. Make the bot look as though it belongs in the country or region relevant to the community.

* Be friendly. Trick individual users into befriending the bot, so its balance of “Friends” to “Friend of” looks healthy.

Some bots show considerable ingenuity in building up fake friendships. As an experiment, I created two similar accounts. One did not respond to befrienders; the other befriended back everyone who asked.

Once the “friendly” account had added a couple of bots, the number of bots befriending that account rose rapidly compared to the unfriendly one. Bot control programs recognise which accounts are more promising for improving their reputation. Most bots remove you as a friend if you haven’t added them back within 24 hours, thus keeping their “Friends” to “Friend of” ratio healthy.
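That pruning behaviour is simple to express. The sketch below is a hypothetical reconstruction of the logic, assuming the bot records when it befriended each account and periodically drops anyone who has not reciprocated within the grace period.

```python
from datetime import datetime, timedelta

def prune_unreciprocated(friended_at: dict, friend_of: set,
                         now: datetime,
                         grace: timedelta = timedelta(hours=24)) -> set:
    """Return accounts the bot would unfriend: those it added more
    than `grace` ago who never added it back. This keeps the
    'Friends' and 'Friend of' counts close, matching the 24-hour
    pattern described above."""
    return {user for user, when in friended_at.items()
            if user not in friend_of and now - when > grace}
```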

A really smart bot will use mimicry: before knocking on your door, it will copy some of your interests into its profile. It will analyse your list of friends, and try to befriend some of them first.

If the bot has friends in common with you, they will be highlighted in the bot’s profile, which may make you more relaxed about adding the bot to your list of friends.
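The mimicry tactic amounts to two set operations on the target’s profile. The helper below is a hypothetical sketch of my own, not code from any real bot: pick a few of the target’s interests to copy, and a few of the target’s friends to approach first so that mutual friends appear on the bot’s profile.

```python
def mimicry_plan(bot_interests: set, target_interests: set,
                 target_friends: set, bot_friends: set,
                 copy_n: int = 5, befriend_n: int = 3):
    """Copy some of the target's interests the bot lacks, and pick
    some of the target's friends to befriend first, so the bot's
    profile shows friends in common when it finally knocks."""
    to_copy = sorted(target_interests - bot_interests)[:copy_n]
    to_befriend = sorted(target_friends - bot_friends)[:befriend_n]
    return to_copy, to_befriend
```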

Building on this approach, community bots will attempt to reach as many community members as possible before trying to join the community itself.

Bots also copy the common human behaviour of reading and commenting on someone’s blog before actually adding that person as a friend. A bot of this sort may only offer you friendship after a week or two of diligently “reading” your blog and leaving comments like “Well said, Sir!”

You are more likely to think you’ve seen this account before and actually held intelligent conversation with it. Vanity is the social engineers’ favourite trait.

Bots need to evolve constantly, to avoid recognisable behaviour patterns. Experienced users notice when names of new “friends” look computer generated, so the bots moved on from random letters and digits to using dictionaries and regularly varying their name generation algorithms.

They can now match stolen content to their declared interests and geographical location. They leave context-aware comments, depending on the specific blog entry.
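The shift in name generation is easy to demonstrate. Both generators below are illustrative sketches (the tiny word list stands in for a real dictionary); the point is how much more plausible the second style looks to a human moderator.

```python
import random

def random_name(rng: random.Random, length: int = 10) -> str:
    """Old-style bot name: random letters and digits -- easy to spot."""
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    return "".join(rng.choice(alphabet) for _ in range(length))

def dictionary_name(rng: random.Random,
                    words=("winter", "moon", "cat", "tea", "rain")) -> str:
    """Newer style: plausible word combinations plus a short number,
    like a human picking a username. The word list is a stand-in
    for a real dictionary."""
    return (rng.choice(words) + "_" + rng.choice(words)
            + str(rng.randint(1, 99)))
```

Compare something like `xq7b2kd91z` with `winter_cat42`: only the first screams “machine-generated”, which is exactly why the botmasters switched.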

To conclude, here is a quote from my (definitely human) friend, a man of various interests and moderator on several busy communities:

“Yesterday, someone asked to join both functional_programming and baroque_music. For a split second I felt a profound joy at having found my perfect soulmate. Such a shame it was a bot.”