Dear reader, it seems that you are causing headaches in dark corners of the web.
I pinpoint you specifically, as a reader of Naked Security, because I assume that if you’re a regular to this site then you’re more likely than most to care about who’s watching you online.
For the people trying to track you, profile you and sell to you, you’re a problem.
Historically, techniques for tracking people’s movements around the web have relied on HTTP cookies – small messages that ‘tag’ your browser so it can be uniquely identified.
Unfortunately for snoopers, profilers and marketers, cookie-based tracking leaves the final decision about whether you’re followed or not in your hands because you can delete their cookies and disappear.
It’s no secret that some vendors have moved on from cookies – local storage, Flash cookies and ETags have all been used in the wild, either as cookie replacements or as backups from which cookies can be ‘respawned’.
These techniques have been successful because they’re obscure but they all have the same fundamental weakness as cookies – they rely on things that you can delete.
The holy grail for tracking is to find a unique ID that you can’t delete, something that identifies you uniquely based on who or what you are, not what you have.
In July I wrote about Panopticlick, a fingerprinting tool that does exactly that. It was created by the Electronic Frontier Foundation (EFF) for its research paper How Unique Is Your Web Browser?.
Panopticlick asks your browser a few questions, such as what fonts you have installed, what HTTP headers your browser sends, your screen size and your timezone.
That collection of information varies so much from one browser to the next that it’s enough to tell any two browsers apart with startling accuracy.
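To make that concrete, here is a rough sketch of the kind of attribute collection a fingerprinting script performs. The attribute list, function name and separator are all illustrative – this is not the EFF’s actual code, just the general shape of the technique:

```javascript
// Hedged sketch: gather a handful of browser attributes and join them
// into one string. No single value identifies you; the combination does.
function browserSummary(nav, screenInfo, timezoneOffset) {
  return [
    nav.userAgent,                     // browser and OS version string
    nav.language,                      // preferred language
    screenInfo.width + 'x' + screenInfo.height + 'x' + screenInfo.colorDepth,
    timezoneOffset,                    // minutes offset from UTC
  ].join('###');
}

// In a real browser you would call it with the live objects:
// browserSummary(navigator, screen, new Date().getTimezoneOffset());
```

Each attribute on its own is shared by millions of people; it’s the combination that narrows the crowd down so sharply.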
The EFF used Panopticlick to show that in the population of internet users it tested (a group likely to be more privacy conscious than average) users had a 1 in 286,777 chance of sharing their fingerprint with somebody else.
That’s certainly good enough to use as a fall-back ‘respawning’ technique but perhaps not quite good enough to work as a cookie replacement.
Since Panopticlick was only designed to show that fingerprinting was viable it didn’t exhaust all the possible browser features that might be exploited for truly bomb-proof fingerprinting.
That such unexplored features exist was alluded to by the authors in their conclusion (my emphasis):
We implemented and tested one particular browser fingerprinting method. It appeared, in general, to be very effective, though as noted in Section 3.1 there are many measurements that could be added to strengthen it.
Fingerprinting beyond the browser
As chance would have it, at the same time as I was writing about Panopticlick, a well known internet company with a foothold on 13 million websites was caught experimenting with one of those ‘missing’ techniques: canvas fingerprinting.
AddThis is the internet’s premier purveyor of social media sharing widgets.
Its code is embedded in millions of websites, which gives it a huge platform on which to run its anonymous personalization and audience technology.
Between February and July 2014 that technology included a live test for a canvas fingerprinting technique.
To illustrate the point I’ve included two pictures of the letter T below with their SHA1 hashes. One was rendered by Firefox 33 on OS X and the other by Safari 8 on the same machine.
Often the most sensible and efficient way for web browsers to handle canvas graphics is to hand over font rendering and 2D compositing to the underlying operating system and hardware GPU.
Different graphics cards and operating systems work slightly differently, which means that different browsers given identical instructions on what to draw will draw slightly different pictures.
In 2012, researchers Keaton Mowery and Hovav Shacham published a research paper entitled Pixel Perfect: Fingerprinting Canvas in HTML5 which showed that there was enough variation to create a reliable browser fingerprint.
In their own words:
...the behavior of <canvas> text and WebGL scene rendering on modern browsers forms a new system fingerprint. The new fingerprint is consistent, high-entropy, orthogonal to other fingerprints, transparent to the user, and readily obtainable.
Remarkably, they didn’t have to try very hard to tease out the differences between graphics cards…
Our experiments show that graphics cards leave a detectable fingerprint while rendering even the simplest scenes.
…nor the way that even common fonts are rendered.
Even Arial, a font which is 30 years old, renders in new and interesting ways depending on the underlying operating system and browser. In the 300 samples collected for the text_arial test, there are 50 distinct renderings.
Since the technique relies on rendering pictures you might think that there would be something you could see that gives the game away, right? Not so.
Our tests can be performed, offscreen, in a fraction of a second. There is no indication, visual or otherwise, that the user's system is being fingerprinted.
Finally, the messy business of comparing pictures is neatly accomplished by converting the picture rendered on the canvas into a string of base64 data (using the toDataURL() method) and running it through a hashing function to create a short, fixed-length ID.
This makes dealing with canvas fingerprints almost as easy as dealing with cookies.
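A minimal sketch of the whole technique looks something like this. The drawing commands are illustrative (real trackers render busier scenes), and the hash is an FNV-1a chosen purely because it’s small enough to show inline – the researchers and AddThis could have used any hash:

```javascript
// A 32-bit FNV-1a hash: turns a long data-URL string into a short hex ID.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Browser-only: draw some text on an offscreen canvas and hash the pixels.
function canvasFingerprint() {
  const canvas = document.createElement('canvas'); // never added to the page
  const ctx = canvas.getContext('2d');
  ctx.textBaseline = 'top';
  ctx.font = '14px Arial';
  ctx.fillText('How unique is this browser?', 2, 2);
  // toDataURL() serialises the rendered pixels as base64 PNG data, so even
  // a one-pixel rendering difference produces a completely different hash.
  return fnv1a(canvas.toDataURL());
}
```

Note that the canvas is never attached to the page, which is why the user sees nothing at all while the fingerprint is taken.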
Mowery and Shacham estimated the entropy of their fingerprint to be about 10 bits, which is impressive but fewer than the 18.1 bits found in the Panopticlick fingerprint.
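The two figures in this article – the EFF’s “1 in 286,777” and the bit counts here – are the same quantity in different units: an event with a 1 in N chance carries log2(N) bits of surprisal. A quick check:

```javascript
// Bits of entropy for an event with probability 1/N.
function bitsOfEntropy(oneInN) {
  return Math.log2(oneInN);
}

const panopticlickBits = bitsOfEntropy(286777); // ≈ 18.1 bits
const canvasPopulation = Math.pow(2, 10);       // 10 bits ↔ 1 in 1,024
```

So the canvas fingerprint alone distinguishes roughly one browser in a thousand, while the Panopticlick fingerprint distinguishes roughly one in 287,000.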
Just as the Panopticlick researchers did, they conclude that there’s more entropy to be found:
We were surprised at the amount of variability we observed in even very simple tests ... We conjecture that it is possible to distinguish even systems for which we obtained identical fingerprints, by rendering complicated scenes that come closer to stressing the underlying hardware
Fingerprinting in the wild
The potential for canvas fingerprinting was obvious but Mowery and Shacham had only shown that it was possible, not that it was being used in the real world.
In 2014, a group of researchers from Princeton and the University of Leuven set out to see if canvas fingerprinting was being used in the wild.
They crawled the home pages of the 100,000 most popular websites and found 20 distinct implementations of canvas fingerprinting.
Nine of them appeared to be home-brewed implementations unique to a single site while 11 of them were third party scripts shared across a number of sites.
The lion’s share of the sites they found, though (some 95% of the 5,542 unique sites that were using canvas fingerprinting), was using code provided by AddThis.
I should be absolutely clear that neither site owners nor users were aware that they were part of an AddThis test bed.
The AddThis code that the researchers found was there to provide social media sharing functionality; the fingerprinting code bundled with it, unannounced, was being used by AddThis for its own ends, not by its customers.
The results of the research were published in a paper, The Web Never Forgets, in July 2014, and caused a bit of a stir in the computer security press.
By a happy and remarkable coincidence, the six month “preliminary initiative to evaluate alternatives to browser cookies” ended at exactly the same time.
AddThis came clean in a blog post shortly after concluding the test and was at pains to reassure users that their privacy had been protected.
... this data was never used for personalization or targeted advertising.
... We don't identify individuals ... and we honor user opt-out preferences any time we act on our data.
... We adhere to industry standards, and have an opt-out process that complies with our membership in the NAI and the DAA. We honored our opt-out policy during this test, and the data was only used for internal research.
In the comments, a representative from AddThis revealed that the test wasn’t wrapped up as a matter of conscience, or even damage limitation, but because it didn’t work very well.
Had the identification actually been good, we would have kicked off a whole new investigation ... But given the results, we're halting the project.
Disappointingly, the post also seeks to justify the company’s actions by invoking an excuse familiar to parents of teenagers the world over – everyone else is doing it, so why can’t we:
Many other companies are working on cookie alternatives, and we wanted to see if this approach worked.
The bottom line
What AddThis didn’t address in its mea culpa is the fundamental thing that makes fingerprinting and other exotic tracking techniques so obnoxious:
They only exist to rob users of the ability to control who tracks them.
Cookies provide a perfectly decent way to identify users – they’re reliable, benign, well understood by users, easy to implement and easy for users to control.
The only ‘problem’ that super cookies, evercookies, fingerprints and other methods ‘solve’ is that of users having opinions about who tracks them.
Users who delete cookies are sending out a clear message that they don’t wish to be tracked. Vendors who use fingerprinting are looking for ways to drown out that message.
How to protect yourself
Fingerprinting is a viable alternative to cookies that’s being used in the wild.
The techniques shown by Mowery, Shacham and the EFF are individually useful but both sets of researchers pointed to ways their techniques might be made better still. The most obvious way to strengthen either technique is to combine it with the other since the two don’t overlap.
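The arithmetic behind that claim is worth spelling out: if two fingerprints really are statistically independent (“orthogonal”, as Mowery and Shacham put it), their entropies simply add. Taking the figures quoted earlier at face value:

```javascript
// Entropies of independent fingerprints add (assumption: full orthogonality,
// which the cited papers suggest but don't prove exactly).
function combinedBits(bitsA, bitsB) {
  return bitsA + bitsB;
}

// Panopticlick's ~18.1 bits plus the canvas fingerprint's ~10 bits:
const total = combinedBits(18.1, 10);        // ≈ 28.1 bits
const crowdSize = Math.pow(2, total);        // ≈ 287 million browsers
```

In other words, the combined fingerprint could in principle single out one browser in a crowd of roughly 287 million.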
That work has already been done and an off-the-peg fingerprinting library that incorporates both techniques is available for free on GitHub.
Existing countermeasures are of limited use: according to the author of the fingerprinting library mentioned above, Private Browsing and Incognito modes have no effect on a browser’s fingerprint.
Privacy conscious users who deploy browser plugins to manage cookies and other tracking mechanisms are also likely to make their fingerprints more distinct, not less.
There is no single, good way to protect yourself but there are things that you can do to make your fingerprint less distinct.
Privacy plugins like Ghostery should protect you from fingerprinting code served from known, third party domains used for advertising or tracking.
Tor also asks for a user’s permission before giving websites access to data on canvas elements, which completely disrupts canvas fingerprinting. The same functionality is available in plugins for Chrome and Firefox.
The EFF is also promising that future versions of its PrivacyBadger plugin will include countermeasures against fingerprinting.
16 comments on “Browser fingerprints – the invisible cookies you can’t delete”
Fascinating stuff, thank you 🙂
Scary stuff. One thing though – will this sort of thing work on, say, an iPad, where the system is presumably much more standardised? I’m guessing websites can’t get at things like what apps you have or anything like that, hopefully couldn’t read in-browser stuff like bookmarks either, so maybe limited to timezone info and a few other minor things like that?
Mobile devices and tablets tend to have more generic fingerprints for exactly the reason you describe. And yes, websites can’t get at apps or bookmarks.
That said, my take on the research so far is that it’s only been looking to see if fingerprinting is possible. For researchers there’s a cost to digging deeper but for trackers there’s a commercial incentive so it’s quite possible there are fingerprinting techniques that are being used that haven’t been uncovered by research.
My guess is that we’ll see a range of techniques deployed together in future. Flash cookies are probably better than fingerprints for tracking because they can track you across multiple browsers but fingerprints might be a useful fall-back for re-spawning them if they’re deleted.
We might also see different techniques deployed against different types of device. I can easily determine your fingerprint but I can also easily determine how distinct it is so if I see you have a generic tablet fingerprint I might send you ETags instead.
Fingerprints also change over time and although Panopticlick showed that you could still track people with some reliability as they change, it would be a lot easier if it were combined with something else.
That said, more than one mobile company has been found adding long, unique IDs to all outgoing HTTP requests, which would do the job just as well as a fingerprint. In one case that unique ID was the phone number: https://nakedsecurity.sophos.com/2012/01/25/smartphone-website-telephone-number/
I’ve no idea if these IDs are being used for tracking (other than by the companies who put them there obviously) but Verizon have something like 120 million customers so it’s not too much of a reach to imagine they are.
You said: “Flash cookies are probably better than fingerprints for tracking” – while it’s true, maybe you should update your post as Adobe Flash is going to die because of all the recent (again) security vulnerabilities it suffers from.
Regards from BEGUERADJ.
I’ll let your comment serve that purpose!
The broader point is that it’s not an ‘either or’. I expect tracking to be done using code libraries that use combinations of techniques depending on what’s available on the target device.
As Mark Twain famously said, “The report of my death was an exaggeration.”
(The fact that you needed to use the word “again” in your comment is a bit of a warning sign eh? 🙂)
One editorial quibble: In the 15th paragraph, the sentence “That such unexplored features exist was eluded to by the authors in their conclusion” should use “alluded” rather than “eluded.”
Thanks for flagging. Now fixed!
[If you visit a website that utilizes canvas image fingerprinting,] Tor warns you with the following message: “This website (www.example.com) attempted to extract HTML5 canvas image data, which may be used to uniquely identify your computer. Should Tor Browser allow the website to extract HTML5 canvas image data?”
Indeed. You might also say something like… “Tor also asks for a user’s permission before giving websites access to data on canvas elements, which completely disrupts canvas fingerprinting.” ; )
One last question if I may: do bookmarks help to identify the fingerprint of a browser?
Thank you in advance.
I don’t know of any fingerprinting techniques that use your bookmarks.
I suppose if a site could get you to save a unique URL in your bookmarks (it can prompt you to, but not make you) and you then used that bookmark to visit the site (which you’d have to choose to do) it would be able to tell it’s you and nobody else.
You’re probably thinking of Local Storage (or Web Storage, it has several names), a feature of HTML5 that allows sites to store megabytes of data on a client’s computer. If that’s what you’re thinking of then it has no specific relationship to bookmarks.
There’s a bit more on Local Storage under ‘Super Cookies’ on this page:
Is there a major difference in the type of information you can extract with Canvas vs WebGL or are they essentially reading the same information?
Tor is definitely the best browser for private browsing. It even warns users when they maximize the browser window, since window size can be used for browser fingerprinting. So it’s advisable to keep the default window size when using Tor or any other private browser.