Logon security company Duo recently found a rather worrying flaw in its own authentication gateway.
A bit of digging revealed that the flaw was reflected in many other so-called single sign-on (SSO) applications, thanks to a problem in handling the underlying “authentication language” that has become a standard for products in this space.
Duo disclosed the problem responsibly late last year, and after giving vendors – including itself – time to fix the bug, has now gone public with an excellent and educational explanation of what went wrong.
In the vocabulary of SSO, network authentication uses dedicated authentication servers, known as IdPs (Identity Providers), to validate requests from client software (users) for access to servers on the network, known as SPs (service providers).
This means that you don’t need to program an authentication module, or maintain a separate password database, or run yet another two-factor authentication service, for every server.
In the jargon, you use an SSO IdP server to handle usernames and passwords for all the other SPs on the network.
Of course, if you want various clients and SPs from different vendors to work cleanly together with an IdP from yet another vendor, you need a uniform data language and vocabulary for them to communicate.
One such language is SAML, short for Security Assertion Markup Language.
SAML is a dialect of XML, which is a sort-of tidied-up form of HTML, the language used to create web pages.
Now, if you have written software or scripts that generate web pages in HTML format, you’ll know that it’s gloriously simple to do – you just stick the right tags at each end of each sentence in bold, each web link, each paragraph, each item in a bulleted list, and so on.
But if you have ever had to write software to go the other way – to read in HTML or XML and make sense of it – then you will know where this article is going.
Generating HTML and then reliably reading it back in are as far apart in difficulty as being able to utter enough badly-pronounced words in a foreign language to find your way to the train station, and being able to chat fluently with a native speaker.
What the bug looks like
Duo did us all a favour by producing a stripped-out representation of the parts that matter in an SAML authentication response; we’ve followed their synthetic example here.
An SAML response typically contains an XML-formatted assertion that identifies the authenticated user, something like this:
<Assertion ID="ABC1245">
  <Subject><NameID>firstname.lastname@example.org</NameID></Subject>
</Assertion>
There should also be a digital signature for the assertion (here identified by the string ABC1245), without which an imposter could simply copy a SAML response, and casually alter the NameID to refer to a different account:
<Signature>
  <SignedInfo><Reference URI="#ABC1245"/></SignedInfo>
  <SignatureValue>digital sig of assertion ABC1245</SignatureValue>
</Signature>
The problem that Duo found was how various programming libraries – including python-saml, used by Duo, and saml2-js – dealt with XML comments inside SAML data structures, and how these comments affected the digital signature process.
Above, the correct data string for the field NameID is obviously firstname.lastname@example.org, being the full text immediately between the start tag <NameID> and the end tag </NameID>.
But if you were to write this instead…
<Assertion ID="ABC1245">
  <Subject><NameID>firstname.lastname@example.org<!-- comment -->.test</NameID></Subject>
</Assertion>
…what’s the correct value for NameID, given that the text <!-- comment --> is supposed to be ignored?
Duo found that buggy SAML libraries would read the NameID string in various ways, sometimes as firstname.lastname@example.org (treating the comment as a terminator for the data field), and sometimes as firstname.lastname@example.org.test (simply treating the comment as if it were not there at all).
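The two readings are easy to reproduce with a general-purpose XML parser. The sketch below uses Python's standard xml.dom.minidom purely for illustration; it is not the code path of any of the affected libraries, and the helper names first_text_node and all_text_nodes are made up for this example.

```python
from xml.dom import minidom

# Duo's synthetic assertion with a comment injected into NameID.
XML = (
    '<Assertion ID="ABC1245">'
    "<Subject><NameID>firstname.lastname@example.org<!-- comment -->.test</NameID></Subject>"
    "</Assertion>"
)


def first_text_node(name_id):
    """Read only the first child text node, so the comment acts as a terminator."""
    return name_id.firstChild.data


def all_text_nodes(name_id):
    """Join every child text node, as if the comment were not there at all."""
    return "".join(
        child.data
        for child in name_id.childNodes
        if child.nodeType == child.TEXT_NODE
    )


name_id = minidom.parseString(XML).getElementsByTagName("NameID")[0]
truncated = first_text_node(name_id)
joined = all_text_nodes(name_id)

print(truncated)  # firstname.lastname@example.org
print(joined)     # firstname.lastname@example.org.test
```

The parser stores the NameID content as three child nodes (text, comment, text), so code that asks only for the first child quietly drops everything after the comment.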
Either interpretation has technical validity, and it doesn’t really matter which approach you choose as long as you are consistent.
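Duo found that wasn’t the case: buggy SAML libraries would ignore the comment when validating the signature, because XML canonicalisation strips comments before the signature is checked, but would stop at the comment when extracting the username. The signed text was therefore effectively firstname.lastname@example.org.test, while the username handed to the application was firstname.lastname@example.org.
In other words, by injecting a comment into the NameID field of a signed SAML response, a crook could alter the username in the authentication message without invalidating its digital signature: a response legitimately issued for firstname.lastname@example.org.test could be made to log in as firstname.lastname@example.org.
As a result, the altered response would pass muster, thus potentially tricking servers on the network into trusting an unauthorised user.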
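To see why a comment can be slipped in without breaking the signature, it helps to look at what actually gets hashed. XML signatures are computed over a canonical form of the signed element, and the canonicalisation modes commonly used leave comments out. The sketch below is a stand-in only: it uses the C14N 2.0 canonicaliser in Python's standard library (Python 3.8 or later) rather than the exclusive canonicalisation real SAML stacks typically apply, and a bare SHA-256 hash in place of a real signature. The names ORIGINAL, TAMPERED and digest are assumptions for this example.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Assertion as issued for an account the attacker legitimately controls.
ORIGINAL = (
    '<Assertion ID="ABC1245">'
    "<Subject><NameID>firstname.lastname@example.org.test</NameID></Subject>"
    "</Assertion>"
)

# The same assertion after the attacker injects a comment into NameID.
TAMPERED = ORIGINAL.replace("example.org.test", "example.org<!-- comment -->.test")


def digest(xml_text):
    """Hash the comment-free canonical form, standing in for the signed digest."""
    canonical = canonicalize(xml_text, with_comments=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


same_digest = digest(ORIGINAL) == digest(TAMPERED)
print(same_digest)  # True
```

Because both documents canonicalise to the same bytes, a signature over the original also verifies for the tampered copy, even though a naive reader of NameID now sees a shorter username.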
What to do?
- If you use an SSO system in your business: check with your vendor if it is SAML-based. If so, ask if it is affected and whether there is a patch available.
- If you are a vendor with any product that speaks SAML: check with your programmers which SAML libraries you use, and whether they need patching.
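For programmers wondering what a defensive check might look like, here is one possible approach, offered as a sketch rather than as the fix that any particular library shipped: refuse any NameID that is not a single, plain text node. The function name strict_name_id is hypothetical, and real SAML documents use XML namespaces that this simplified example ignores.

```python
from xml.dom import minidom


def strict_name_id(assertion_xml):
    """Return the NameID text, or raise if it is not a single plain text node."""
    doc = minidom.parseString(assertion_xml)
    name_id = doc.getElementsByTagName("NameID")[0]
    children = name_id.childNodes
    if len(children) != 1 or children[0].nodeType != children[0].TEXT_NODE:
        # Comments, nested elements or split text all land here.
        raise ValueError("NameID must contain exactly one text node")
    return children[0].data


clean = (
    '<Assertion ID="ABC1245">'
    "<Subject><NameID>firstname.lastname@example.org</NameID></Subject>"
    "</Assertion>"
)
tampered = clean.replace("</NameID>", "<!-- comment -->.test</NameID>")

print(strict_name_id(clean))  # firstname.lastname@example.org
try:
    strict_name_id(tampered)
except ValueError as err:
    print("rejected:", err)
```

Rejecting unexpected structure outright is stricter than the XML rules require, but it sidesteps the question of which interpretation is the right one.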
Finally, at the risk of sounding impractically pompous, re-evaluate everywhere that you’ve used an XML-based approach to data when you didn’t need to.
As a wise man once said, “There is no limit to how much worse you can make a computer security problem by using XML in the process of solving it.”