HTML5 allows websites to save data on your hard disk for the next time you visit.
Much like cookies, only different.
The cookie system in HTTP has two big disadvantages compared to what’s generally referred to in HTML5 as Web Storage.
Firstly, cookies are wastefully sent in the HTTP headers of every request made back to the server that set the cookie in the first place.
Secondly, cookies are limited in size, mainly because of the first reason, to about 4KBytes.
In the modern, data-rich web, that doesn’t leave much room to manoeuvre.
The Web Storage system, however, is driven by JavaScript, not by HTTP headers, and is actually surprisingly simple to use.
You just set properties on the localStorage JavaScript object inside the browser, and read them back later.
The official W3C standards document offers an example like this:
<p>You have viewed this page <span id="count">an untold number of</span> time(s).</p>
<script>
  if (!localStorage.pageCount) {
    localStorage.pageCount = 0;
  }
  localStorage.pageCount = parseInt(localStorage.pageCount) + 1;
  document.getElementById('count').textContent = localStorage.pageCount;
</script>
The value localStorage.pageCount keeps track of how many times you have visited the page, even from one browser session to another, and setting the textContent property of the element returned by document.getElementById() makes the counter appear in the page itself.
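The same counter can also be written with the getItem() and setItem() methods of the Storage interface, which make it more obvious that values are always stored as strings. What follows is only a minimal sketch: the helper name bumpPageCount and the in-memory stand-in makeMemoryStorage are made up for illustration, so that the code can run outside a browser. In a real page you would pass window.localStorage instead.

```javascript
// Hypothetical in-memory stand-in for localStorage, for use outside a browser.
// Like the real thing, it stores every value as a string.
function makeMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
    removeItem: (key) => { data.delete(key); },
  };
}

// Increment a visit counter held in any Storage-like object and return it.
function bumpPageCount(storage, key = 'pageCount') {
  // getItem() returns null for a missing key; parseInt(null, 10) is NaN,
  // so fall back to zero on the first visit.
  const previous = parseInt(storage.getItem(key), 10);
  const next = (Number.isNaN(previous) ? 0 : previous) + 1;
  storage.setItem(key, next);
  return next;
}

const storage = makeMemoryStorage();   // in a page: window.localStorage
bumpPageCount(storage);                // first "visit"
console.log(bumpPageCount(storage));   // second "visit": prints 2
```

Passing the storage object in as a parameter is a design choice for this sketch: it keeps the counter logic testable without a browser.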
For security reasons, each domain gets its own localStorage object (strictly speaking, each origin, meaning the combination of scheme, hostname and port), so that data can’t leak from one site to another, and for safety reasons, the size of each object is limited.
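When a script tries to store more than the limit allows, browsers throw an exception from setItem(). The exact exception name and code have varied between browsers, so the sketch below simply treats any exception from setItem() as "could not save". Both makeQuotaStorage (a stand-in that counts characters against a tiny made-up quota) and trySave are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-in for a quota-limited localStorage (sizes in characters).
function makeQuotaStorage(quotaChars) {
  const data = new Map();
  const used = () => {
    let total = 0;
    for (const [k, v] of data) total += k.length + v.length;
    return total;
  };
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem(key, value) {
      const text = String(value);
      // Space already taken by this key, if we are overwriting it.
      const old = data.has(key) ? key.length + data.get(key).length : 0;
      if (used() - old + key.length + text.length > quotaChars) {
        const err = new Error('quota exceeded');
        err.name = 'QuotaExceededError';
        throw err;
      }
      data.set(key, text);
    },
  };
}

// Try to save a value; report failure instead of letting the exception escape.
function trySave(storage, key, value) {
  try {
    storage.setItem(key, value);
    return true;
  } catch (err) {
    return false;
  }
}

const tiny = makeQuotaStorage(20);
console.log(trySave(tiny, 'a', 'x'.repeat(10)));  // true: needs 11 of 20 characters
console.log(trySave(tiny, 'b', 'y'.repeat(10)));  // false: would need 22 in total
```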
Web Storage also comes in the form of sessionStorage. Each browser window gets its own sessionStorage variable, and, as the name implies, all the values in it are lost when the session ends.
Each domain gets somewhere between 2.5MBytes (Chrome) and 10MBytes (IE) of localStorage to use.
However, as blogger Todd Anglin noted back in 2011:
“Some browsers have exposed a workaround that grants "a1.website.com" and "a2.website.com" their own 5MB LocalStorage quotas.”
Anglin saw this as a viable way around the quota limit, but also pointed out that:
“[this] is specifically frowned upon in the HTML5 Web Storage spec. Browser authors are asked to prevent multiple sub-domains of a single site from being given a bigger localStorage pool.”
Anglin therefore advised against this bodge to boost your storage size because it was “likely to break in future.”
But Stanford student Feross Aboukhadijeh recently found that for most mainstream browsers, Anglin’s future still lies ahead of us.
You can still bag extra localStorage using the multiple sub-domain trick.
Indeed, Aboukhadijeh created a web page that lets you inflict this trick on yourself, and the results are dramatic.
He can quickly grab gigabytes of your disk space by getting you to visit his one-off domain.
That might not sound like much of a Denial of Service (DoS) attack, but it’s not supposed to happen, for obvious reasons.
And that’s what really matters: that browsers (and network programmers in general) don’t always take specifications seriously.
By the way, Firefox users can relax: your browser already applies a 5MByte limit at the domain level.
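To see why the accounting level matters, here is a rough back-of-the-envelope sketch comparing the two policies. It assumes a 5MByte quota per bucket and uses made-up hostnames in the style of Anglin's example. The function names are hypothetical, and siteOf() is a deliberately naive shortcut that just takes the last two labels of a hostname, which is not how a real browser works out which site a hostname belongs to.

```javascript
const QUOTA_BYTES = 5 * 1024 * 1024;  // assumed 5MByte quota per bucket

// Naive "site" key: the last two labels of the hostname.
// This shortcut is only for illustration.
function siteOf(hostname) {
  return hostname.split('.').slice(-2).join('.');
}

// Total bytes a set of hosts can claim when each fills its quota bucket.
// keyFor maps a hostname to the bucket its quota is charged against.
function maxClaimable(hostnames, keyFor) {
  const buckets = new Set(hostnames.map(keyFor));
  return buckets.size * QUOTA_BYTES;
}

// 200 made-up subdomains of one made-up site.
const hosts = Array.from({ length: 200 }, (_, i) => `a${i + 1}.website.com`);

const perHost = maxClaimable(hosts, (h) => h);  // quota charged per subdomain
const perSite = maxClaimable(hosts, siteOf);    // quota charged per domain

console.log(perHost / (1024 * 1024), 'MBytes');  // 1000 MBytes
console.log(perSite / (1024 * 1024), 'MBytes');  // 5 MBytes
```

With per-subdomain accounting the total grows with every extra subdomain; with per-domain accounting it stays at the single quota no matter how many subdomains are used.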
Aboukhadijeh says he’s reported this bug, together with his practical demonstration of how easy it is to abuse, to the other browser vendors.
Let’s see how long they take to respond, if indeed they consider it a problem worth fixing.
Isn't there a bug in your code when incrementing the page count in local storage?
Sure is 🙂
I shortened the variable name used in the W3C document for space reasons…but made the changes inaccurately, it seems.
Thanks. Now fixed.
If you use the Chrome feature "history" –> "clear all browsing data" –> and check Clear data from hosted apps and Deauthorize content licenses, does it clear the cache?
The sample code seems to have a bug. The next to last line should have
localStorage.pageCount
on both sides of the equals sign
See reply to @hey above.
Thanks for spotting the error 🙂
That's funny. I don't remember cookies being part of any of the HTML standards.
You're right. I used the term "HTML" as if it meant "web related data stuff", which was a bit too loose.
I have changed the text "HTML" to "HTTP" in order more closely to reflect where cookies happen (if that is not exchanging one loose way of speaking for another).
If HTML 5 is able to store large files on my computer, what stops these files from containing malware?
The data blobs stashed in localStorage can contain pretty much anything, including malware.
Of course, that malware could only be retrieved and used later inside your browser by more JavaScript served from the same website.
So anything malicious it could deliver, save in localStorage, and then use later, could be delivered and used right away without the "side trip" through localStorage.
You can perhaps imagine some scenarios where delivering malware in pieces, visit by visit, and then stitching it all together later might make detection slightly harder (you never see the entire malware warhead in one piece until the last minute).
But I don't really see how it could make the risk much worse, if at all, than exploit packs which just target-and-infect in one stroke…
It’s easy to see why this workaround was implemented by the browser vendors. The HTML5 standard makes a seemingly arbitrary decision on behalf of the user that Availability of their disk space trumps the Confidentiality or Integrity of the information in LocalStorage.
Given that siteA dot example dot com might be a trusted site whilst siteB dot example dot com may be a freshly squeezed attack site, preventing access to the former’s LocalStorage by the latter rather than assuming trust on the basis of a shared domain seems a prudent control to have in place.
Moreover, the trade-off between the risk of a DoS through disk-space exhaustion and the risk of snooping or modding of another subdomain’s stored data seems an artificial one.
If the browser alerted and asked for confirmation when subdomain storage exceeded e.g. 50MB total for a domain, then again at every 50MB increment, the disk exhaustion attack wouldn’t be viable without user interaction. A message like “The site exhaustionA211 dot example dot com is requesting 5MB of disk space; the domain example dot com already has 1050MB allocated to other subdomains. Are you sure you want to permit this?” should do the trick.
That way the separation of subdomain data could be maintained and we could have our cookie and eat it…
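The prompting rule suggested in this comment can be sketched in a few lines. This is only an illustration of the idea under stated assumptions, not anything a browser actually implements: the function names, the use of whole MBytes, and the 50MByte default step are all hypothetical, and "exceeded" is read strictly, so a total of exactly 50MBytes does not yet trigger a prompt.

```javascript
// Count how many whole-step thresholds a total has strictly exceeded,
// e.g. with a 50MByte step: 50 -> 0, 51 -> 1, 100 -> 1, 101 -> 2.
function thresholdsExceeded(totalMB, stepMB) {
  return Math.max(0, Math.ceil(totalMB / stepMB) - 1);
}

// Decide whether granting a request should first be confirmed by the user.
// All names and the 50MByte default are illustrative, not a browser API.
function needsConfirmation(domainTotalMB, requestMB, stepMB = 50) {
  const before = thresholdsExceeded(domainTotalMB, stepMB);
  const after = thresholdsExceeded(domainTotalMB + requestMB, stepMB);
  return after > before;
}

console.log(needsConfirmation(0, 5));     // false: first 5MBytes for the domain
console.log(needsConfirmation(48, 5));    // true: 48 -> 53 passes the 50MByte mark
console.log(needsConfirmation(1050, 5));  // true: the example from the comment
```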
And so wouldn't the workaround for Firefox be to avoid subdomains & just point to iframe/js on other domains under my control?
And since they would be under my control, the whole issue of accessing one-another's stored data is moot. Instead of accessing via the browser-agent/localhost, the data would be passed between domains back at the server, right?
This gives me the same willies as does html-enabled email.