Sophos Techknow – Malware on Linux: When Penguins Attack
Podcast: Malware on Linux: When Penguins Attack
Presenter: Paul Ducklin [PD].
Guest: Chester Wisniewski [CW].
Date: July 4, 2015
This transcript has been edited for clarity.
START OF PODCAST
[FX: TECHNO-TYPE MUSIC]
PD. Welcome to Techknow, where Sophos experts explore, explain, and hopefully help you to understand the often baffling world of computer security.
Techknow is presented by me, Paul Ducklin…
CW. …and me, Chet Wisniewski.
PD. Chester, in the past, when we’ve done these, we’ve had science-and-engineering titles, like “All About Java” or “Understanding Vulnerabilities”.
But today’s title is the much more intriguing “When Penguins Attack.”
So, let’s kick off with you telling us what you mean by that.
CW. Well, I was invited to speak at a Linux conference on a security topic, and I started looking into it and thought, rather than get into the debate about Linux malware in and of itself – which is malware attacking Linux systems, the same way we perceive malware attacking Windows systems, or Mac systems, for example – I thought it might be interesting to look at the other side of it.
What role does Linux play in spreading malware?
We clearly think of Linux more as a server operating system, or perhaps the operating system that powers the cloud, more than we think of Linux as a desktop computing platform.
PD. So the “Penguin” refers to Tux, the Linux logo, right?
CW. Yes, exactly.
And when you look on Google for “angry penguins”, there are actually quite a lot of interesting things out there, but very little about Linux.
So I thought maybe it was time to fix that.
PD. It seems that the background to your research, and indeed the title “When Penguins Attack” rather than “When Penguins Get Attacked,” is reflecting the fact that in the malware ecosystem, or the cybercrime ecosystem, Windows is the primary platform that the crooks want to *infect*.
Whereas Linux is, if you like, the enabler or the *infector* – the delivery platform.
CW. Yes, that actually is one of the conclusions I came to.
But I wasn’t really sure when we began the research.
Those of us in the security community like to think of Windows as being the weak link in the security chain, and I thought, “What is the real impact of the operating system hosting all of this stuff?”
Is it Windows? Is it Linux?
What is the mix of how likely a given operating system is to be part of that attack chain?
PD. So tell us, in your research, what prevalence you found for the various server operating systems – Windows, the BSDs, Linux, et cetera – and the various software stacks that provide networking, web servers, and database backend services.
CW. Well, I took a look at 178,000 URLs that I gathered from about a week’s worth of data in SophosLabs, and I took a look at that by platform initially.
Is this theory that Windows or Linux is more involved or less involved even valid? Is the mix of malicious websites, say, identical to the mix of legitimate websites?
That they’re equally likely to be involved in the attack chain?
PD. Just as a quick aside, let’s just look at those numbers.
That’s 178,000 *new* malicious URLs – in other words, stuff that wasn’t in our database before – in one week?
PD. So actually, whether it’s Penguins attacking or Windows attacking, there’s still an awful lot of it out there!
When you went looking, how did it all break down?
CW. In the end, I discovered that 20.1% of the services that were spreading malicious web content were hosted on some sort of a Microsoft platform, the majority of those being Internet Information Server [IIS], but there are lots of different Microsoft web servers, so I sort of lumped them all together.
The next one I looked at was Apache on UNIX and Linux hosts, which was 36.5% of the malicious URLs that I discovered, followed up by 15.9% running nginx, a popular high-speed web server and proxy used more on the enterprise side of web hosting.
And then, after that, 9.6% were running some sort of a Google-identified web server, meaning it was operated in some way by Google.
And then it falls off from there.
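A breakdown like the one above can be sketched by grouping HTTP `Server` banners into families and counting them. This is a minimal illustration only, not the method used in the research: the prefix table and the sample banners are assumptions made up for the example, and a banner is merely what a server chooses to advertise.

```python
from collections import Counter

# Hypothetical prefix table (an assumption for this sketch): lower-cased
# "Server" banner prefix -> family name.
FAMILIES = [
    ("microsoft", "Microsoft"),
    ("apache", "Apache"),
    ("nginx", "nginx"),
    ("gws", "Google"),
    ("gse", "Google"),
]


def classify(banner):
    """Return the family whose prefix matches the banner, else "Other"."""
    lowered = banner.strip().lower()
    for prefix, family in FAMILIES:
        if lowered.startswith(prefix):
            return family
    return "Other"


def breakdown(banners):
    """Return each family's share of the banners, as a percentage."""
    counts = Counter(classify(b) for b in banners)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: round(100.0 * n / total, 1) for family, n in counts.items()}


if __name__ == "__main__":
    # Made-up sample banners, not data from the study.
    sample = ["Microsoft-IIS/7.5", "Apache/2.2.22 (Ubuntu)", "nginx/1.4.6", "Apache", "gws"]
    print(breakdown(sample))
```

Banners can be blank or deliberately misleading, so any real tally would need an "unidentified" bucket of the kind mentioned in the discussion.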
PD. If you take out that 20.1% that are Windows, probably IIS, the rest are almost all running on Linux, aren’t they?
Whatever web server platform and database backend they happen to have?
CW. The ones I was able to identify as Linux specifically were about 79%, so there’s a fraction of a percent in there that I was not able to identify, or identified as being some sort of BSD, like FreeBSD or Mac OS X.
PD. It does look, just at first blush – 178,000 brand-new, malicious URLs in just one week – 80% of them are hosted on Linux.
Their goal is mostly, if I’m not wrong, to get malware onto a Windows computer, to make money out of the victim in some way.
So why is it that Penguins attack? What’s the reason that so many Linux servers are getting infected?
CW. Well, there’s no definitive way to know precisely how these Linux boxes came to be compromised.
But one of the things I was trying to do, with the data set that we had, was to figure out how many of them are innocent websites that have been compromised by criminals, instead of intentionally set up by criminals to host bad things.
And then I wanted to look at the data of those innocent sites and see if there were any indicators as to why a criminal might pick a Linux box over a Windows box.
The data was pretty interesting from that perspective.
I was worried that a lot of these sites might get cleaned up very quickly after we discovered them being attacked.
So, with one week’s worth of data, I was working very quickly to see if I could figure out what kind of thing they might be hosting that was malicious, and how they got infected, fearing that they would be cleaned up and taken offline and I wouldn’t be able to gather the data.
Unfortunately, that didn’t happen.
Nearly all of them were online for the week after I gathered the data, and in fact many of them were online even a month later.
This suggests, perhaps, that one of the reasons that criminals may choose to target Linux is that it has a long time of infection – that people aren’t closely observing their servers, and aren’t aware that they’re infected, and aren’t cleaning them up.
PD. So, 80% of the servers that are trying to foist malware on Windows users are running Linux.
What percentage of those, do you think, are actually run by the crooks as part of their own infrastructure, the core of what they’re doing?
And how much is basically free malicious hosting that they’re borrowing – or, rather, let me say, stealing – from people whose security isn’t quite up to scratch? How did that divide out?
CW. Well, the numbers divide out about the same.
I think again we’re at about the 80-20 mark when I look at the whole picture.
PD. And, presumably, the reason they go after your server and my server, rather than hosting the servers themselves, is exactly the same reason that they use end-users’ Windows computers for sending spam.
(A), it makes them a moving target, and (B), someone else is footing the bill.
CW. And (C), they can cash out on your positive reputation.
Sites like Google and Bing, that rank pages for search, often look at things based on their reputation to decide on their PageRank, as Google calls it.
And so, by commandeering sites that are hosting legitimate content, that perhaps have been around for a year, or five, or maybe even 10 years – they’re going to rank more highly in search results, and perhaps have their own organic traffic, as it’s called in the marketing department, coming to them to find more victims.
PD. So if somebody does take action by saying, “Hey! You’re going on a blocklist,” it’s not the crooks, it’s the person who was running the insecure server.
CW. And it’s a real nightmare for the victims, because you can imagine all the different security companies out there that discover your website is hosting something malicious and put you on a blocklist.
PD. OK, so the crooks are getting into other people’s servers.
(A), they can steal their PageRank; (B), they can steal their bandwidth; (C), they let the other person take the blame.
Once they’re in, how do they get the malware from these delivery computers, from the infected Linux servers, onto the victim’s Windows computer?
What are the primary vehicles they use for doing the “Penguin Attack,” if we can call it that?
CW. Well, usually the innocent websites just have a redirection link of some sort embedded in them that just points to something malicious down the line.
Most large web hosts and providers are intolerant of hosting the malware itself on their network, and if they hear about or discover it, they’ll very quickly shut down those websites.
So, when I looked at the dataset, large providers like RackSpace were nearly non-existent for hosting a Windows Trojan file, for example, because they very carefully monitor their networks.
What the criminals want to do with these innocent sites that are infected, is simply use them as the top of a funnel – to let you slide down the funnel, and redirect you towards something at the bottom of the funnel.
Something that’s maybe hosted somewhere more dodgy that’s more willing to host bad things, that the criminals can put exploits on.
PD. So this is the kind of thing we call an “exploit kit,” right?
It picks from a list of available exploits that are likely to work, and then tries them one-at-a-time in the hope that one of them will work.
And if it does, “Bingo!”
Then they stuff malware on your computer – and they can choose what malware they want to deliver at delivery time, can’t they?
CW. In fact, they look at your geolocation, or other things, and put you into a “bidding marketplace” for different criminals to bid on how much they’re willing to pay to put their malware on your computer.
If they’re currently deciding they want to do a Denial of Service against German banks, perhaps if you’re a German user hitting that website, they may install a DDoS bot onto your computer to make their attack be more localised and more powerful.
But maybe you hit that same malicious website from Japan, and maybe today they’ll install some ransomware to steal your files and lock them up with encryption so you have to pay a ransom to get them back.
In the criminal marketplaces, they actually have bidding on this stuff – and, of course, the criminal controlling all those infected websites wants to maximise the amount of profit he’s going to get by selling people to the highest bidder.
PD. That was the next question I wanted to ask.
If you’re the guy who’s paying to have your malware delivered, the money you’re going to be making will come from things like spamming; the data that you steal; DDoSes you start; the ransomware you install where people pay the fee; and so forth.
But if you’re a crook running the server side of the business, how do they charge for those services?
What are you buying when you’ve written some malware and you want to get it out, say, to victims in New Zealand?
What do you actually buy from the crooks who are running the Attack Penguins?
CW. Usually, it’s done in what’s called “Pay-per-Install,” and that’s what I was referring to with some of this bidding.
As a budding malware author who’s got some new ransomware, I’m going to go to a forum and say, “I’m willing to pay $0.75 per victim if you can distribute my malware for me, and the victims I’m looking for run Windows…or OS X…or are in Germany.”
And, usually, the more specific the criteria, the higher the price you’re going to pay, as a criminal, to buy those services.
PD. It sounds as though what you’re saying, really, is that even in what we might rather casually call “broad-brush cybercrime,” almost all attacks have some element of targeting in them, don’t they?
CW. It’s a capitalist marketplace for the criminals, so as they’re getting more and more refined, and getting more and more specialised into different types of crime, they want different types of victim.
PD. What do we do to take the Attack Penguins out of the equation?
If you’re someone who’s buying time on a hosted server that’s running Linux, or if you’re the operator of those hosted servers yourself, what do you do to try to cut the crooks out of the equation and begin to win the game?
CW. Well, I was very cautious in my research not to break any laws.
So I don’t have solid facts, for example, as to how many of these sites were victims of vulnerabilities in a particular software application, or a library that wasn’t patched on the system.
Because to find out if they were actually vulnerable, I’d have to attack them – which I wasn’t willing to do.
But I was able to gather a lot of information about how up-to-date those servers were, to give me an idea as to how likely that might be why they were attacked.
And we also have a lot of other information anecdotally, from helping victims who reach out when they have problems, to discover some of the ways people are being attacked.
I think the two primary ways Linux boxes are being targeted are through software vulnerabilities and through credentials being stolen.
Many websites are still being updated by things like FTP, where those credentials are not only transmitted in plain text, but also often cached or stored in web browsers or FTP helper programs for publishing websites, where they may be stolen by other malware.
PD. So as well as that mantra we often repeat, “Update early, update often,” there’s also another key part that we don’t always say, isn’t there?
In other words, it’s not good enough to update your Linux and then your LAMP stack if you then don’t go and update the special plugins – which you added because they seemed like a good idea at the time – that are actually directly processing potentially hostile remote content, like images, or emails, or blog comments.
CW. A rather famous example from last year for WordPress was one called “TimThumb,” that’s a thumbnail generation plugin for making thumbnails of your images on your blog.
It was installed on millions of WordPress sites around the world, and it had a vulnerability that had nothing to do with updating the content management platform itself.
You had to go and get this fix from a third party – and I think that’s where we often get ourselves in the most trouble.
PD. And my own experience as a long-time Linux user is that the open source world is very rich in dependencies, isn’t it?
I’ve often done something like “package-get new-software-i’d-like-to-install” for something fairly modest, say, a graphical calculator, and I’ll get a list back saying, “Oh, and you also need to install the following giant list of libraries.”
And, of course, any one of those could bring a problem with it.
So it seems that when you’re configuring a Linux server, whether it’s the host itself or one of the virtual machines running on it, it’s really important to know which bits depend on what.
And that map may be non-trivial to draw, but you really have to do it, don’t you?
CW. Yes, you do.
I think we’re getting better at this, but we’re not necessarily taking action on it.
You perhaps need to get yourself in the habit, just like Update Tuesday, of picking a date in the calendar, setting it “recurring,” and saying, “Hey! It’s the 10th of the month. I’m going to check all my Linux boxes and make sure they’re up to date, make sure those processes have been working and I got all those fixes.”
From the evidence I saw, looking at the version numbers of Apache web servers or nginx web servers, the version numbers of PHP being advertised by some of them – the vast majority of the systems that were compromised were not just a little bit out of date, but at least a *year* or more out of date.
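The kind of check described here, reading the version numbers a server advertises and asking how old they are, might look roughly like the sketch below. The `MINIMUM` table is a hypothetical placeholder rather than a recommendation, and the function names are invented for this example.

```python
import re

# Hypothetical "oldest acceptable" versions, chosen only for illustration.
MINIMUM = {"apache": (2, 4, 0), "nginx": (1, 8, 0), "php": (5, 6, 0)}

# Matches tokens such as "Apache/2.2.22" or "PHP/5.3.10" in a header value.
PATTERN = re.compile(r"(apache|nginx|php)/(\d+(?:\.\d+)*)", re.IGNORECASE)


def advertised_versions(header_value):
    """Extract {product: (major, minor, patch)} from a banner string."""
    found = {}
    for name, version in PATTERN.findall(header_value):
        parts = tuple(int(p) for p in version.split("."))
        # Pad short versions such as "2.4" so tuple comparison is fair.
        parts = parts + (0,) * (3 - len(parts))
        found[name.lower()] = parts
    return found


def outdated(header_value):
    """List the advertised products that are older than MINIMUM."""
    versions = advertised_versions(header_value)
    return sorted(name for name, v in versions.items() if v < MINIMUM[name])


if __name__ == "__main__":
    print(outdated("Apache/2.2.22 (Ubuntu) PHP/5.3.10"))
```

A banner is only a hint: distributions often backport fixes without changing the advertised version, and many servers hide it, so this flags candidates for a closer look rather than proving that a host is vulnerable.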
PD. OK, Chester, I want to get towards wrapping up now, so I’m going to ask you the $64 question that so often causes flame wars in the Linux world.
Can a Linux anti-virus help you?
CW. Well, I’d like to think the answer is, “Yes.”
When I looked at this, of course, all these sites were detected by our products as being infected, and that suggests that at least 178,000 that week were something anti-virus would have helped.
And if that’s not enough reason, I don’t really know what is.
And anti-virus, in addition to just looking for Trojan Horses for Windows, can identify that malicious code when it’s stored in your website, and can alert you to that fact so you can protect your visitors and customers.
PD. There is a great irony here, isn’t there, that on Windows, everyone is scrambling to make sure that their anti-virus is preventative, has an on-access scanner that will stop you getting infected in the first place.
The days of just detecting after the event are over: that’s a last resort.
But it seems that in the Linux world, where one server could end up infecting hundreds of thousands of Windows users, some sites – at least in your research – were staying infected for so long that even if they’d done a virus scan once a week, they’d have been in a much better state, and helped the next guy that much more.
CW. I recommend people use an on-access scanner on Linux no differently than they would on Windows.
But for those that fear the performance impact they perceive that might incur…which I don’t really think is there, but if you think that, why not run a scan daily or weekly at 3am in a cron job, just to check everything and make sure it doesn’t look as though anything’s been altered by any malicious code?
I don’t see why you wouldn’t.
In fact, Sophos Anti-Virus for Linux is available entirely for free.
So there’s no cost in performance; there’s no cost in dollars.
I think it would be silly not just to have a little health check once in a while to make sure that things look OK.
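As a complement to the scheduled scan suggested above, a periodic check that nothing in the web root has been altered could be sketched as follows. This is a plain file-integrity comparison using SHA-256 hashes, not an anti-virus scan and not a Sophos tool, and the function names are assumptions for the example.

```python
import hashlib
import os


def snapshot(root):
    """Map each file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def changes(baseline, current):
    """Compare two snapshots and report added, removed and modified files."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            p for p in baseline if p in current and baseline[p] != current[p]
        ),
    }
```

A script built on these two functions could save a baseline after each legitimate deployment and then be run from cron at a quiet hour, such as 3am, to report anything that has changed since. The baseline should be stored somewhere the web server account cannot write to.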
PD. It’s just like full disk encryption.
In many cases, people are against it because they say, “Ooooh, it’s going to waste 4% of my CPU.”
They try it, and either it doesn’t eat anything at all that they can measure, or they realise that the difference is so negligible that it doesn’t affect their ability to work at all, or it doesn’t affect the ability of their server to deliver files to their users.
CW. Yes, and I think we’re finally getting past that stage where CPU is really a factor in anything.
You know, I’ve got six cores in my phone that I’m not sure what to do with…
PD. [Laughs] Sounds like a song.
CW. Yeah… [laughs]
PD. “I’ve got six cores on my wagon, and I’m still rolling along.”
CW. But on our web servers, at one point, we were also fearful of the impact of having SSL enabled, or TLS enabled, for the performance impact.
I think Google and Facebook have published study after study showing that is completely a myth at this point.
Something like anti-virus, encryption, TLS: all of these things are quite irrelevant to performance in a modern operating system, considering how beneficial they are to us when we have them.
PD. Bit of a no-brainer, isn’t it?
CW. It is.
And I think, as professional IT people, we need to think about our friends and family as well.
One of the things that struck me about the data was how many of these sites clearly are operated by non-professionals for their small businesses; for their church group; for their soccer club; this kind of thing.
We need, I think, to get away from telling people, “Hey, go to CheapWebHostX and for $5 a month you can set up a blog to tell everybody when the next game is, or when the next gathering is.”
Instead, maybe start urging our friends and neighbours, if they’re not professional IT people, or they can’t afford professionals to do the services for them, maybe this is a really good opportunity to move to the cloud, and use a cloud-hosted blog instead.
Where professionals *are* there to maintain security, and monitor those sites for compromise, and provide a little extra layer of benefit.
PD. Chester, I think that’s a great place to end.
What I’ve realised out of this is, once again, proof that an injury to one is an injury to all – or, in this case, an injury to one server can be an injury to thousands, hundreds of thousands, of users.
So, if you do have Linux servers, don’t be one of those who let your Penguin attack.
Be defensive, and stay strong against the crooks!
By the way, if you enjoyed this podcast, we have plenty more at soundcloud dot com slash SophosSecurity…
…and, until next time, stay secure.
[FX: TECHNO-TYPE MUSIC]
END OF PODCAST