Conspiracy theorists can stand down from puce alert!
A network outage that affected US providers including Google and Cloudflare on Monday, intermittently diverting traffic via China…
…has been chalked up to a blunder.
Internet traffic depends heavily on a system called BGP, short for Border Gateway Protocol, which ISPs use to tell each other what traffic they can route, and how efficiently they can get that traffic to its destination.
By regularly and automatically communicating with one another about the best way to get from X to Y, from Y to Z, and so on, internet providers not only help each other find the best routes but also adapt quickly to sidestep outages in the network.
Unfortunately, BGP isn’t particularly robust, and the very simplicity that makes it fast and effective can cause problems if an ISP makes a routing mistake – or, for that matter, if an ISP goes rogue and deliberately advertises false routes in order to divert or derail other people’s traffic.
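To see why a bad announcement can win so easily, here is a deliberately tiny sketch in Python. It is a toy model, not how a real router works, and not a reconstruction of this incident: real BGP weighs many more factors than path length, such as local policy and prefix specificity. The Route class, the best_route function, and the AS numbers (taken from the private-use range) are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str               # destination network, e.g. "203.0.113.0/24"
    as_path: tuple[int, ...]  # networks the announcement passed through; origin is last


def best_route(routes: list[Route]) -> Route:
    """Prefer the shortest AS path (only one of several real BGP tie-breakers)."""
    return min(routes, key=lambda r: len(r.as_path))


# Hypothetical AS numbers from the private-use range (64512-65534).
legitimate = Route("203.0.113.0/24", (64512, 64513, 64514))
print(best_route([legitimate]).as_path)

# A neighbour wrongly announces a shorter path to the same prefix...
bogus = Route("203.0.113.0/24", (64520, 64514))

# ...and because this toy router does no validation, the bogus route simply wins.
print(best_route([legitimate, bogus]).as_path)
```

Nothing in the selection step asks whether the shorter path is truthful, which is the weakness in a nutshell.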
Simply put, good news about reliable routes travels fast via BGP, but bogus news about incorrect or nefarious routes travels just as fast, until someone notices and the competent majority in the community react to correct the blunder.
That’s what seems to have happened in this case, where traffic to Google and other networks was intermittently disrupted, though fortunately not for long:
Customer behind Cogent and NTT experienced the @google outages likely in 5 waves between these times (UTC) 74 minutes total:
21:13 - 21:17 4min
21:18 - 21:21 3min
21:22 - 21:28 6min
21:30 - 21:50 20min
21:51 - 22:32 41min
example ASpath: 174 2914 20485 4809 37282 15169
BGPmon.net (@bgpmon), November 12, 2018
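If you want to check BGPmon's figures yourself, the short Python sketch below recomputes the length of each wave from the times in the tweet and splits the example AS path into hops. The variable and function names are our own, chosen for illustration. The only BGP knowledge it assumes is the convention that each network prepends its own number as an announcement passes through, so the AS that originated the route appears last.

```python
from datetime import datetime

# Outage windows exactly as listed in the tweet (UTC).
WINDOWS = [
    ("21:13", "21:17"),
    ("21:18", "21:21"),
    ("21:22", "21:28"),
    ("21:30", "21:50"),
    ("21:51", "22:32"),
]


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


durations = [minutes_between(start, end) for start, end in WINDOWS]
total_minutes = sum(durations)
print(durations, total_minutes)

# The example AS path from the tweet, read left to right.
as_path = [int(asn) for asn in "174 2914 20485 4809 37282 15169".split()]
origin_as = as_path[-1]      # the AS that originated the route is listed last
transit_ases = as_path[:-1]  # every network the announcement passed through
print(origin_as, transit_ases)
```

The five waves do add up to the 74 minutes quoted, and the path shows traffic crossing five other networks before it reaches the origin.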
To envisage BGP blunders in driving terms, imagine that you are cruising on the freeway but receive a radio or satnav alert that the road ahead is closed just after the next exit, due to an accident.
You dutifully take the next exit to get off the freeway, only to find that the bulletin was wrong – it’s the next on-ramp that’s closed, not the freeway itself.
In other words: you’ve needlessly left the fast-flowing freeway; you can’t get back on it again without diverting through a nearby town; and everyone else who heard the bulletin did the very same thing, thus making a bad situation worse and clogging up the town centre.
What went wrong?
In this case, it looks as though a Nigerian ISP made a routing mistake that was accepted by a huge Chinese ISP, thereby inadvertently causing the outage – in internet terms, an injury to one can easily become an injury to all.
But was it a mistake, or should we assume some sort of conspiracy?
After all, Nigeria is popularly connected with online fraud; China is frequently accused of internet espionage; and a recently published paper explicitly claimed that China has been systematically using BGP hijacking as a cyberattack technique.
Put all of this together and it’s easy to jump to the conclusion that something deliberate and nefarious happened here, rather than simply blaming a momentary lapse.
But, as experienced network operators have already pointed out, if this were a deliberate hijack, it was a spectacularly obvious and ineffective one, because the ISP community at large quickly noticed and got it fixed.
Nevertheless, blunders of this sort do send network traffic where it wouldn’t usually go, giving even more people than usual the chance to sniff it out, capture it, and comb through it later.
So it’s a great reminder of the slogan you’ll see on the T-shirts we like to wear in our live videos:
Dance like no one’s watching
Encrypt like everyone is
(Yes, you can buy those shirts in the Sophos online store :-)