Google adds (some) malware and phishing info to Transparency Report

Google has expanded its Transparency Report data to include stats from its 'Safe Browsing' system, which keeps tabs on where malware and phishing sites are hosted.

The data is a little short on definition, but it does give some interesting insights into which hosting providers are doing the worst job of keeping their IP space clean.

The twice-yearly Transparency Report has traditionally covered more politically-sensitive topics - which countries are blocking access to Google services, and who's been asking Google to provide data on their users (or "product"), or to take stuff down that might be found offensive for some reason, or in breach of copyright.

Some of this stuff is interesting in itself, not least when it very nearly names-and-shames dodgy political and judicial figures trying to abuse their authority and silence their critics.

There's also quite a big question mark hanging over just how "transparent" it all is, in the light of the whole PRISM brouhaha.

For the most part it seems fairly detailed and fine-grained though, or at least gives the impression of trying to be, as far as "the man" will let them, with some of the data even provided as spreadsheets for proper looking at by proper science-y types.

The new data is based on the Safe Browsing programme, which combines scanning by Google and reports from the wider web world to keep tabs on where the bad stuff is at; browsers use the data to warn about dangerous pages, and Google uses it to flag search results, to protect users from potential malware and phishing.

It's a little less detailed; much of it consists of little graphs showing trends of malware and phishing spotted over time. Some of it is rather hard to find much value in, with data for related topics covering wildly different time periods and thus hard to compare.

Some of the graphs seem more useful, but may not be; an apparently clear, if somewhat loose, correlation between the number of malware sites and phishing sites picked up at any given time may imply a definite link between the two activities, but could also simply be showing how hard the Google scanning crew were working that week.

The one graph which does seem clear is the contrast between "attack" and "compromised" sites - i.e., sites deliberately set up to get you, versus legitimate sites that have been taken over by the bad guys. The graph shows actual attack sites on the increase recently, but still barely registering - it seems the compromised sites outnumber them massively, and always have.

Again, there is, of course, room for some sampling bias here - it's quite possible that the attack sites are better at hiding from Google, and of course they have no legit owners or admins to spot the compromise and report it.

Some numbers are available for these graphs, but they require some mouse skills to hover over the exact spot you're interested in.

The real detail is on the "Malware Dashboard" page though. This breaks down the sites recorded by the Safe Browsing scheme by Autonomous System (AS - basically an ISP or other large-ish body responsible for a subsection of the internet).

Google malware dashboard

It provides a rather undramatic world map highlighting which geographic regions are especially malware-ridden (nowhere's that much worse than anywhere else, it turns out), but then also breaks down the data by AS, including details of how many threats have been spotted in each.

The clear leader recently, using the default three-month view, is one called "Webair Internet Development", a US-based ISP where Google found 43% of the sites it checked to be malicious.

Looking at a sample of the domains they host seems to confirm some old stereotypes - it seems to be remarkably popular with gambling, pharmacy and porn sites, with domain names like "top3casino", "247-pharmacy" and "seemyass" jumping out of the list.

This impression is reversed by checking into the next two in the list though, American Access Integrated Technologies and Spain's True Records; both are listed as hosting 40% bad sites, but both are apparently hosting a random selection of legit-sounding domains (although, of course, there seems to be a fair amount of porn in both).

Again we come back to sampling error though.

The Webair listing says 43%, but as you may have spotted, that's 43% of sites checked. In the period covered, Google has only actually looked at 2% of the sites hosted there. So, it all comes down to how good the Safe Browsing team are at deciding which sites to check.

If they're super hot and have pinpointed all the bad stuff in the whole AS with just a few misses, we've got 43% of 2%, aka 0.86% - not such bad guys after all.

On the other hand, if they're really terrible and have foolishly started their scanning with the handful of clean sites on a seriously malware-riddled section, it could be as high as 98.86% danger.
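
For anyone who wants to run the same sums on another provider, here is a minimal sketch of the two extremes above. The function name and its parameters are made up for illustration and are not anything Google provides; it simply assumes the unchecked sites are either all clean or all bad.

```python
def malicious_share_bounds(checked_fraction, malicious_among_checked):
    """Return (lower, upper) bounds on the share of ALL hosted sites that are bad.

    Hypothetical helper for illustration only. Both inputs are fractions
    between 0 and 1, e.g. 0.02 for "2% of sites checked".
    """
    if not (0.0 <= checked_fraction <= 1.0 and 0.0 <= malicious_among_checked <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")

    # Sites that were checked AND found malicious, as a share of everything hosted.
    known_bad = checked_fraction * malicious_among_checked
    # Sites nobody has looked at yet.
    unchecked = 1.0 - checked_fraction

    # Best case: every unchecked site is clean.
    # Worst case: every unchecked site is malicious.
    return known_bad, known_bad + unchecked


# The Webair figures from the dashboard: 2% of sites checked, 43% of those bad.
low, high = malicious_share_bounds(0.02, 0.43)
print(f"{low:.2%} to {high:.2%}")
```

With the Webair numbers this gives the same 0.86% and 98.86% as above. The real figure could sit anywhere in between, which is rather the point.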

That's the problem with stats, really - and we're not even considering whether the results of the Safe Browsing checks could be in error.

Looking at the longer term, by turning the dial up to the maximum 1 year, the top five are all in the 80s and 90s, apart from number 1 which, rather intriguingly, is listed as "unknown" - they know it's the biggest, but can't say why.

All of this top five also list the % of the total AS scanned as "unknown". Not much for those real science-y people to play with here, unfortunately.

So what's the use of it all?

Well, the actual data on whether or not your site is listed is made available to site admins, which is helpful, but there's nothing new here. The main value of this new regular report, it would seem, is to highlight potentially dodgy providers.

So, if you're running a website and your provider comes high up in one of these lists, get in touch with them. Ask them, hey, what's up, are you some sort of haven for crooks, or just incompetent?

If they really are dirty, you might just get them to clean up their act. If not, you'll at least be helping keep them on their toes.

And if you've somehow got your mum's flower arranging club website registered with a Russian 'bulletproof' provider, then maybe this should give you fair warning it's time to move it on.


Images of world wide web courtesy of Shutterstock.


3 Responses to Google adds (some) malware and phishing info to Transparency Report

  1. coachdaddyblogger · 299 days ago

    John - if you could advise Google on a next step for this, what would it be?

  2. Laurence Marks · 299 days ago

    John Hawes wrote "Looking at the longer term, by turning the dial up to the maximum 1 year, the top five are all in the 80s and 90s, apart from number 1 which, rather intriguingly, is listed as "unknown" - they know it's the biggest, but can't say why."

    Perhaps this indicates that Google has identified the IP address of the scanned site but cannot figure out what AS it resides in. This "unknown" could be a bunch of small obscure ASes. Thinking about it, isn't that what a malicious server operator would want to do? Have his own AS so complainants would have to use traceroute to figure out who his backbone provider is and complain there?

    In the 90s (when I had a lot more time than I do now), I used to run down spammers by analyzing received email headers and then using traceroute to find their backbone connections, then notifying everyone for a couple of layers up the chain. Some spammers actually ran through two or three domains of indirection. This worked pretty well except for Spamford Wallace who had a direct backbone connection.

  3. maureen m. · 295 days ago

    I applaud Google offering customer access and monitoring of Account Activity. On May 26 I was being alerted by Google regarding activity on my Web Browser. A persistent cyber attacker has been present since March 25, despite wiping my devices and the addition of anti-malware reinforcement. I arrived at the scene just in time to find that my hacker had accessed browsing via a cracked Gmail and was surfing cyber terrorist websites. Declining discussion, I was shut down and suspended 30 days.

    When another intrusion occurred June 25, resulting in my Web History being wiped, one Gmail disabled and another closed completely, I was alerted and saw two separate IPs - one in L.A., the other in N.Y. This time I scrambled to generate a quick note requesting their attention to that fact. I also had them check my recent history which identified my attack, and informed them of my retention of private investigators doing forensics at that moment. My greatest relief was over clearing my name, and not risking being placed on some watch list.

    Google is improving, but access to customer support in a situation like this is necessary.

About the author

John Hawes is Chief of Operations at Virus Bulletin, running independent anti-malware testing there since 2006. With over a decade of experience testing security products, John was elected to the board of directors of the Anti-Malware Testing Standards Organisation (AMTSO) in 2011.