Researchers recently revealed a new vulnerability in the design of Tor, the world’s favourite weapons-grade privacy tool.
In their presentation, Non-Hidden Hidden Services Considered Harmful, given at the recent Hack in the Box conference, Filippo Valsorda and George Tankersley showed that a critical component of the Dark Web, Tor’s Hidden Service Directories (HSDirs), could be turned against users.
Targeting HSDirs is so easy that the researchers suggest you should avoid the Dark Web if you really care about your anonymity.
According to Valsorda and Tankersley:
Hidden service users face a greater risk of targeted deanonymization than normal Tor users … It would probably be better to let them use Tor on your TLS-enabled clearnet site.
To understand how the vulnerability works and how it chips away at Tor’s armour we need to start by looking at how Tor works and how it’s been attacked in the past.
How Tor works
Tor (AKA The Onion Router) is software that provides computers with privacy protection and anonymity.
It can be used to access the regular internet anonymously, or the so-called Dark Web where sites and services (recognisable by addresses ending in .onion) also enjoy Tor’s protection.
Tor works by routing your traffic through a handful of computers, called a circuit, that use encryption to hide your IP address from the site or service you’re talking to. The computers in your circuit, called relays, are chosen at random from a global pool of around 7,000 computers that act as Tor nodes.
Network packets are wrapped in multiple layers of encryption and sent to their destination via your circuit. Each relay in the circuit peels back one layer of encryption, revealing the address of the next relay.
Since each relay only knows about the relay before and after it, no computer in the circuit knows both the ultimate origin and destination of your traffic.
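To make the layering concrete, here is a minimal sketch of the idea as described above. It is a toy, not Tor's real cryptography or cell format: the SHA-256-based XOR keystream, the JSON "next" header and the function names wrap and peel are all assumptions made for illustration.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). NOT real Tor cryptography."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer of encryption (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, destination: str, circuit: list[tuple[str, bytes]]) -> bytes:
    """Build the onion from the inside out: the last relay's layer goes on first."""
    next_hop = destination
    onion = payload
    for name, key in reversed(circuit):
        # Hypothetical header format: one JSON line naming the next hop.
        header = json.dumps({"next": next_hop}).encode() + b"\n"
        onion = xor_layer(key, header + onion)
        next_hop = name
    return onion


def peel(key: bytes, onion: bytes) -> tuple[str, bytes]:
    """What one relay does: strip its own layer and learn only the next hop."""
    header, _, inner = xor_layer(key, onion).partition(b"\n")
    return json.loads(header)["next"], inner


if __name__ == "__main__":
    circuit = [("guard", b"key-guard"), ("middle", b"key-middle"), ("exit", b"key-exit")]
    onion = wrap(b"hello", "example.com:443", circuit)
    for name, key in circuit:
        next_hop, onion = peel(key, onion)
        print(f"{name} forwards to {next_hop}")
```

Each relay in the sketch sees only ciphertext plus the name of the next hop; only the last relay recovers the destination and the payload.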
The first relay in the circuit is known as the entry guard and the last as the exit node.
That exit node (which could be anywhere in the world) is where your traffic appears to come from. It’s also a prime location for spying on or deanonymising Tor users accessing the regular internet.
If you use Tor to access the Dark Web then your traffic passes through two circuits, one established by you and another established by the .onion site you’re using, and the two circuits meet at a ‘rendezvous point’ in the middle.
Like all software architectures, Tor’s design is the result of a series of optimisations, trade-offs and compromises that leave it weaker in some areas than others.
If it has an Achilles heel in its design and implementation, it’s a weakness to traffic correlation attacks; if you can observe enough of the traffic entering and leaving the Tor network then (with some fancy statistical analysis) you can match up the comings and goings and see who did what, defeating the smoke and mirrors of the circuit.
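The "fancy statistical analysis" can be surprisingly simple in principle. The sketch below is an assumption-laden illustration rather than any published attack: it counts packets per one-second window at each end and uses a plain Pearson correlation to guess which outgoing flow belongs to an incoming one. The function names and the sample timestamps are made up.

```python
from statistics import mean


def bin_counts(timestamps, window, duration):
    """Count how many packets fall into each time window."""
    bins = [0] * int(duration / window)
    for t in timestamps:
        idx = int(t / window)
        if 0 <= idx < len(bins):
            bins[idx] += 1
    return bins


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_match(entry_flow, exit_flows, window=1.0, duration=60.0):
    """Return the name of the exit-side flow whose timing best matches the entry-side flow."""
    entry_bins = bin_counts(entry_flow, window, duration)
    scores = {
        name: pearson(entry_bins, bin_counts(ts, window, duration))
        for name, ts in exit_flows.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Made-up packet times (seconds) seen entering the network...
    entry = [1.0, 1.1, 1.2, 5.0, 5.1, 20.3, 20.4, 20.5, 20.6, 41.0]
    # ...and two candidate flows seen leaving it.
    exits = {
        "flow-a": [t + 0.2 for t in entry],  # same bursts, slightly delayed
        "flow-b": [3.0, 3.5, 10.1, 10.2, 30.0, 30.1, 30.2, 50.5, 50.6, 55.0],
    }
    print(best_match(entry, exits))
```

Real traffic is far noisier than this, which is why an attacker needs to watch a lot of it, but the principle is the same: bursts going in line up with bursts coming out.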
In 2012, researchers at the US Naval Research Laboratory and Washington DC’s Georgetown University investigated Tor’s vulnerability to traffic correlation attacks and concluded:
An adversary that provides no more bandwidth than some volunteers do today can deanonymize any given user within three months of regular Tor use with over 50 percent probability and within six months with over 80 percent probability.
2014’s Operation Onymous, a 17-nation sting that took out over 400 Dark Web sites, is widely thought to have involved a correlation technique developed by Carnegie Mellon University with $1 million of FBI funding.
Research has understandably tended to focus on Tor’s entry guards and exit nodes because of the visibility they give to traffic entering and exiting Tor circuits.
The focus on exit nodes has left Tor users accessing the regular web (where only a single circuit is established) more vulnerable to traffic correlation than Tor users on the Dark Web (where two circuits are used).
So attacking Tor using traffic correlation attacks isn’t just possible, it’s practical. But it’s a difficult technique to use against Dark Web users, it’s inefficient if you’re targeting a single .onion site, and it’s time-consuming and expensive.
What Valsorda and Tankersley unveiled at the Hack in the Box conference was a new way of conducting correlation attacks that addresses all these things, a method that in their own words made things “way easier”.
Using HSDirs in correlation attacks
When you connect to a .onion site Tor has to perform the difficult conjuring trick of connecting a user and a service who are both trying not to be found.
To overcome this, sites provide details about how and where you can communicate with them by publishing a list of ‘introduction points’ to a distributed database, known as the Hidden Service Directory. (Tor nodes that form part of the Hidden Service Directory are given the HSDir flag and referred to as HSDirs.)
Just before you access a .onion site your computer uses a formula to work out which HSDir to talk to and then asks it where the site’s introduction points are.
Valsorda and Tankersley realised that this puts the HSDir at one end of a Tor connection; correlation attacks could use HSDirs instead of exit nodes and could target those hard-to-reach Dark Web users.
If your target uses a hidden service, don’t need exit relay to see when the connection happens.
Instead, be an HSDir.
HSDirs can serve the same purpose against a hidden service as a malicious exit relay would in a basic correlation attack
It’s actually worse, because it’s way easier to be the user’s HSDir
How easy is it to become an HSDir?
You just have to provide a computer that will act as a Tor relay for four days.
Of the 7,000 nodes in the Tor network, about 3,000 have the HSDir flag and about 2,000 are acting as entry guards. With a few hundred computers acting as both entry guards and HSDirs you could give yourself a box seat on both ends of a significant amount of Tor traffic.
And it gets worse.
The researchers also discovered that it was possible to set up just a handful of HSDirs and target the users of a specific .onion site.
Targeting individual .onion sites
Every day a .onion site generates a descriptor ID and publishes a digest of it to a list that also includes digests of HSDir IDs. According to the Tor Rendezvous Specification the next three HSDirs in the list after the descriptor ID will be that site’s HSDirs that day:
A hidden service directory is deemed responsible for a descriptor ID if it has the HSDir flag and its identity digest is one of the first three identity digests of HSDir relays following the descriptor ID in a circular list.
The HSDirs change each day because the site’s digest, which is based on its ID and the date, also changes (in a deliberately predictable way) every day.
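The sketch below shows why the descriptor ID is predictable and how the "next three in a circular list" rule plays out. The formula is a simplified reading of the (version 2) Tor Rendezvous Specification and should be checked against the spec itself before relying on it; the function names are assumptions, and the HSDir digests in the usage example are made-up placeholders rather than real relays.

```python
import base64
import bisect
import hashlib
import struct


def descriptor_id(onion_address: str, unix_time: int, replica: int,
                  descriptor_cookie: bytes = b"") -> bytes:
    """Descriptor ID for a 16-character .onion address at a given time (simplified)."""
    label = onion_address[:-6] if onion_address.endswith(".onion") else onion_address
    permanent_id = base64.b32decode(label.upper())  # 80 bits taken from the site's key
    # The day rolls over at a different time for each site, offset by the first ID byte.
    time_period = (unix_time + permanent_id[0] * 86400 // 256) // 86400
    secret_id_part = hashlib.sha1(
        struct.pack(">I", time_period) + descriptor_cookie + bytes([replica])
    ).digest()
    return hashlib.sha1(permanent_id + secret_id_part).digest()


def responsible_hsdirs(desc_id: bytes, hsdir_digests: list[bytes],
                       count: int = 3) -> list[bytes]:
    """The first `count` HSDir digests that follow desc_id in a circular sorted list."""
    ring = sorted(hsdir_digests)
    start = bisect.bisect_right(ring, desc_id)
    return [ring[(start + i) % len(ring)] for i in range(min(count, len(ring)))]


if __name__ == "__main__":
    # Placeholder digests standing in for the real HSDir list.
    fake_hsdirs = [hashlib.sha1(bytes([i])).digest() for i in range(50)]
    for replica in (0, 1):
        did = descriptor_id("facebookcorewwwi.onion", 1_460_000_000, replica)
        print(replica, did.hex(), [d.hex()[:8] for d in responsible_hsdirs(did, fake_hsdirs)])
```

Nothing in the calculation is secret: anyone who knows the .onion address and the date can work out tomorrow's descriptor IDs today.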
The question the researchers asked themselves was: if the descriptor ID is predictable, how easy is it to muscle into the positions in the list just after a target site’s descriptor ID, ensuring that you become its HSDir that day?
They answered that question by randomly generating IDs for their HSDirs until they found one that slotted into the list in just the right place.
It took 15 minutes on a MacBook Pro.
To prove the point they then used their brute force technique to make themselves all six HSDirs (three for each of the two copies of the descriptor that a site publishes) for Facebook’s .onion site, facebookcorewwwi.onion, for a day (that link only works if you’re using Tor).
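The brute force step can be sketched in a few lines. This is an illustration under stated assumptions, not the researchers' tool: a real relay's identity digest is derived from its public key, so an attacker would generate key pairs, whereas the sketch substitutes random bytes (os.urandom) for key generation. The names in_gap and find_identity, and the example values, are hypothetical.

```python
import hashlib
import os


def in_gap(candidate: bytes, lower: bytes, upper: bytes) -> bool:
    """True if candidate sits after lower and before upper on the circular list."""
    if lower < upper:
        return lower < candidate < upper
    return candidate > lower or candidate < upper  # the gap wraps past the end


def find_identity(desc_id: bytes, next_honest: bytes, max_tries: int = 1_000_000):
    """Try random identities until one lands between the target and the next honest HSDir."""
    for tries in range(1, max_tries + 1):
        fake_key = os.urandom(32)  # stand-in for generating a real relay key pair
        digest = hashlib.sha1(fake_key).digest()
        if in_gap(digest, desc_id, next_honest):
            return fake_key, digest, tries
    return None


if __name__ == "__main__":
    # Made-up positions: the target's descriptor ID and the honest HSDir that follows it.
    target = bytes.fromhex("40") + bytes(19)
    honest = bytes.fromhex("48") + bytes(19)
    result = find_identity(target, honest)
    if result:
        _, digest, tries = result
        print(f"landed at {digest.hex()} after {tries} tries")
```

The narrower the gap between the descriptor ID and the next honest HSDir, the more attempts are needed, and taking every slot means finding one identity per slot inside that gap.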
Mitigation and monitoring
Kate Krauss, a spokeswoman for the Tor Project, reassured Motherboard that help was on its way (but hasn’t arrived just yet) in the form of updates to the Tor code:
[the attack] is hard to do without getting caught … with next-generation hidden services, this attack will become nearly impossible
In the meantime, site owners can use the brute forcing technique demonstrated by Valsorda and Tankersley defensively, to ensure that their site’s information is published to HSDirs that they control.
The researchers have released a suite of tools that allow you to interact with and analyse Tor HSDirs.
Separately, two more researchers, from Northeastern University in Boston, Massachusetts, recently tried to measure just how many rogue HSDir nodes there might be out there.
For the answer to that you’ll have to read Paul Ducklin’s recent article on Honey Onions.
6 comments on “Can you trust Tor’s hidden service directories?”
Hi! The answer is “Yes you can” – not least because the risk to individual users as-outlined is actually pretty small, especially where using Tor to bypass censorship rather than to provide anonymity. Also: the report is not new, it’s more than a year old, and is either getting, or has been, fixed.
The ‘fix’ is coming later this year. This, as well as methods such as DNS correlation, remain a problem, funny, considering the latter attack (DNS) is only effective on clearnet sites…I’d be more worried about ‘them’ using their massive network infrastructure to determine which hidden services a user has visited, resulting in a red-flag being applied to your NSA file, further triggers of NSA filters, resulting in further automated escalation of your surveillance, not to mention the fact that the FBI are going to be very interested in why you’re visiting a site they don’t approve of, which obviously means you’re a terrorist right? Thanks to #Rule41 and the new #NSADataSharing agreements, the #FBI are free to query the NSA database, with virtually no oversight or record of said query, easily thumbing through the NSA’s massive DBs (metadata, ISP logs, calls, emails, etc, etc.)…Hell, if they’re bored they may even decide to exercise #Rule41 on dat ass and hit you with some NSA exploits/backdoors, turning every device you own into listening posts and much worse (#PRISM partners = Forced backdoors in case you forgot)….SMH…
Those who can’t learn from history are doomed to repeat it.
#ThoughtCrime #Orwell #1984
On a point of order, because you’re asking us to treat this as fact, you have got the George Santayana quote rather importantly incorrect. He didn’t actually say that we ought to “learn from history”. His words were similar, but interestingly and quite pointedly different: “Those who cannot remember the past are condemned to repeat it.”
Santayana’s quote was translated from spanish, besides, it wasn’t meant to be taken literally, remembering something and learning from it are essentially the same thing, are they not? This is how his original statement was framed and hence the reasoning for my statement, which speaks to any possible way of interpreting that quote anyways….Besides, did you see any “”? No?
Smh..this IS typically how important matters are received by the majority of society…”YEA BUT SNOWDEN BROKE THE LAW!” Yea, so did Washington and Anne Frank. Those who believe that we are somehow the most enlightened generation and that the injustices of the past could never happen again are…very very ignorant, willfully so.
You’re prob just trolling anyways…
Remembering and remembrance are quite different from learning. You can get a sense of that if you are in a country such as the UK or Oz at 11am on the 11th day of November.