Move over, Naked Security’s video on how to Pick a Proper Password!
Step aside, XKCD’s harder-than-you-think CorrectHorseSomethingSomething.
There’s a new sort of password in town: the iambic tetrameter.
In a word, poetry.
To explain: an iambus is a rhythmic unit in verse that consists of an unstressed syllable, followed by a stressed one.
ig-NITE ex-PLODE ka-BOOM
And a tetrameter is a line of verse with four rhythmic units, more properly known as metrical feet.
In other words, iambic tetrameters go something like this:
A BI-cy-CLE is NICE and QUICK
But CHAN-ging GEAR is QUITE a TRICK.
If CHAN-ging GEAR is HARD to DO
Then SIN-gle-SPEED's the THING for YOU.
And WHEN it COMES to SLOW-ing DOWN
A FIXED rear COG is GREAT in TOWN.
Just NEV-er LET your CHAIN get SLACK,
It's NOT like RID-ing ON the TRACK.
A SLOP-py CHAIN will SOON de-RAIL,
And LOCK the WHEEL. An EP-ic FAIL!
And DO be WARNED: your FRIENDS will JOKE,
"You're SUCH an URB-an HIP-ster BLOKE."
Because of the rhythm, typically accompanied by rhyme, and maybe even a melody to carry the story along, many of us find it much easier to memorise verse than prose.
Here’s a road law that’s memorable, for example:
The driver of a vehicle shall stop at the request of any person leading, riding or driving any cattle, horse, ass, mule, sheep, goat, pig or ostrich.
But the above sentence isn’t that easy to memorise, even if you try to imagine a visual clue, because:
As Homer wrote, or maybe said,
It's verse that sticks inside your head.
Linguistic passwords to the rescue?
So, poetry is what two computer scientists at the University of Southern California (USC) have been trying as a technique for generating passwords that satisfy two requirements:
- Passwords should be complex, and measurably so. (In the jargon, they should have provably high entropy.)
- Passwords should be easy to memorise.
Hitting both of these targets is something of a holy grail for password creation tools, and here’s why.
Complex passwords are easy to construct.
For example, imagine that you want 60 bits of entropy, so that each password is literally a 1-in-2^60 random choice (2^60 is about one million million million).
Twelve characters from the set [A-Z0-9] (upper case letters plus digits) give you 36^12 choices, which meets your needs, because 36^12 is greater than 2^60.
But then your passwords look something like this:
TTJGJOIYGIH2
EPNVF13LXVV9
21K0JV1MFISH
MFP8QRAB749X
H1IRVIHZL41I
So these passwords fail at the second hurdle, namely being easy to commit to memory.
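If you want to check that arithmetic for yourself, here is a minimal Python sketch. The names `entropy_bits` and `random_password` are our own illustrative choices rather than anything from a particular tool, and the sketch assumes that every character is picked independently and uniformly at random.

```python
import math
import secrets
import string

# Alphabet taken from the text: upper case letters plus digits (36 symbols).
ALPHABET = string.ascii_uppercase + string.digits


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a string of `length` independent, uniform symbols."""
    return length * math.log2(alphabet_size)


def random_password(length: int = 12) -> str:
    """Pick each character with a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(36 ** 12 > 2 ** 60)              # True
    print(round(entropy_bits(36, 12), 1))  # 62.0
    print(random_password())               # different on every run
```

Twelve characters come out at about 62 bits, which is a little above the 60-bit target.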
Similarly, readily-memorised passwords are easy to construct:
PASSWORD
123456
CHANGEME
Heck, you can easily extend them to twelve characters to make them nice and long:
PASSWORD9999
123456789ABC
CHANGEMESOON
But these passwords just aren’t complex enough for real life.
Automated password cracking tools typically start with a list of really obvious passwords, and then try a list of obvious modifications to each of those obvious passwords.
As a result, semi-obvious passwords, and even semi-non-obvious passwords, often get cracked much faster than you might expect.
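To see why padding out an obvious password doesn't help much, here is a rough Python sketch of the "obvious base plus obvious modification" idea, written as a checker you might run against a proposed password. The tiny lists `COMMON` and `SUFFIXES`, and the function names, are made up for illustration; real cracking tools use far larger lists and many more transformation rules.

```python
# Hypothetical, deliberately tiny lists for illustration only.
COMMON = ["password", "123456", "changeme"]
SUFFIXES = ["", "1", "123", "9999", "soon", "!"]


def obvious_variants(bases=COMMON, suffixes=SUFFIXES):
    """Yield each base with each suffix, in three simple capitalisations."""
    for base in bases:
        for suffix in suffixes:
            candidate = base + suffix
            yield candidate
            yield candidate.upper()
            yield candidate.capitalize()


def is_obvious(password: str) -> bool:
    """True if the password is one of the generated variants."""
    return password in set(obvious_variants())


if __name__ == "__main__":
    print(is_obvious("PASSWORD9999"))  # True
    print(is_obvious("CHANGEMESOON"))  # True
    print(is_obvious("TTJGJOIYGIH2"))  # False
```

Even this toy rule set produces only 54 candidates, yet it catches two of the three "extended" passwords shown above.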
60-bit “natural language” passwords
So, our USC researchers decided to work on algorithms that take a randomly-chosen 60-bit number, and use various techniques to turn it into one of four types of password.
They tried XKCD-style passwords consisting of four words randomly chosen from a dictionary, where the dictionary effectively operates as a giant alphabet:
fees wesley inmate decentralization
photo bros nan plain
embarrass debating gaskell jennie
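One way to picture the "dictionary as a giant alphabet" idea is the Python sketch below. It assumes the 60-bit number is cut into four 15-bit chunks, which calls for a dictionary of 2^15 = 32,768 words; that split is our assumption for illustration, not a detail confirmed here. `WORDLIST` is a placeholder that stands in for a real dictionary, and `number_to_words` is a hypothetical name.

```python
import secrets

BITS_PER_WORD = 15       # assumed: 4 words x 15 bits = 60 bits
WORDS_PER_PHRASE = 4

# Placeholder dictionary ("word00000" ... "word32767"); swap in real words.
WORDLIST = [f"word{i:05d}" for i in range(2 ** BITS_PER_WORD)]


def number_to_words(n: int) -> list[str]:
    """Map a 60-bit number to four dictionary words, 15 bits per word."""
    if not 0 <= n < 2 ** (BITS_PER_WORD * WORDS_PER_PHRASE):
        raise ValueError("n must be a 60-bit non-negative integer")
    mask = (1 << BITS_PER_WORD) - 1
    words = []
    for _ in range(WORDS_PER_PHRASE):
        words.append(WORDLIST[n & mask])  # lowest 15 bits pick a word
        n >>= BITS_PER_WORD               # move on to the next chunk
    return words[::-1]                    # most significant chunk first


if __name__ == "__main__":
    print(" ".join(number_to_words(secrets.randbits(60))))
```

Because each chunk selects exactly one word, every 60-bit number produces a different four-word phrase.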
They also tried producing 15 random letters, and generating a vaguely meaningful sentence where the words start with each letter in turn, so that the password effectively acts as a sort of mnemonic for itself:
IMMTOUPRSILLMCN: It makes me think of union pacific resource said it looks like most commercial networks.
SCKTWRDSYDFCTAC: Some companies keep their windows rolled down so you don't feel connected to any community.
Then they tried what they call the frequency method, where whole words are chosen randomly via a lookup table, thus producing realistic sentences like the ones above, but generally shorter than 15 words, and thus quicker to type in:
Fox news networks are seeking views from downtown streets. (9 words)
The review found a silver tree through documents and artifacts. (10 words)
And finally, they went for iambic tetrameters, in rhyming couplets:
Joanna kissing verified
Soprano finally reside

Diversity inside replied
Retreats or colors justified
Note that the algorithms they created don’t just turn out likely candidates, but produce a specific, repeatable passphrase for every number from 0 to 2^60-1.
Indeed, they can go backwards uniquely from each passphrase to the number that is its equivalent, in order to be sure that each password is equally likely to be chosen, and thus that their entropy really is 60 bits.
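The two-way mapping is the important part, so here is a toy Python sketch of the principle. It is not the researchers' algorithm: the three word lists in `SLOTS` are invented, and they cover only 4 x 8 x 2 = 64 phrases rather than 2^60. The point is simply that a mixed-radix encoding gives every number exactly one phrase, and every phrase exactly one number.

```python
# Invented toy vocabulary: one list per position in the phrase.
SLOTS = [
    ["quiet", "purple", "rapid", "hollow"],
    ["otters", "engines", "rivers", "planets",
     "tigers", "lanterns", "pirates", "violins"],
    ["wander", "tumble"],
]

TOTAL = 1
for slot in SLOTS:
    TOTAL *= len(slot)  # 4 * 8 * 2 = 64 possible phrases


def encode(n: int) -> str:
    """Turn a number in range(TOTAL) into its unique phrase."""
    if not 0 <= n < TOTAL:
        raise ValueError("n out of range")
    words = []
    for slot in reversed(SLOTS):       # peel off the last position first
        n, index = divmod(n, len(slot))
        words.append(slot[index])
    return " ".join(reversed(words))


def decode(phrase: str) -> int:
    """Recover the number from a phrase produced by encode()."""
    n = 0
    for slot, word in zip(SLOTS, phrase.split()):
        n = n * len(slot) + slot.index(word)
    return n


if __name__ == "__main__":
    print(encode(0))                         # quiet otters wander
    print(encode(63))                        # hollow violins tumble
    print(decode("hollow violins tumble"))   # 63
```

If the starting number is chosen uniformly at random, every phrase is equally likely, which is what lets you state the entropy with confidence.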
Which way is best?
The trials our authors conducted are where things get interesting.
→ Take these with a pinch of salt, because the sample sizes are very small, with only about 10 people completing the memorisation process in each of the four categories, for 44 participants overall.
The methods that users said they preferred came out in this order:
- First-letter mnemonics. (15-word passphrases)
- Frequency method. (Approximately 10-word passphrases)
- Poetry. (Two rhyming iambic tetrameters)
- XKCD method. (Four unrelated words)
However, when it came to how well people actually remembered their passphrases later on, the results were almost upside down:
- Poetry: 61% recall.
- XKCD method: 58% recall.
- Frequency method: 40% recall.
- First-letter mnemonics: 33% recall.
So can we really, truly say/That password verse might save the day?
Could poetry (OK, doggerel) be the answer to our password complexity needs, at least for English speakers?
Sadly, the systems that accept our passwords simply aren’t universally ready for any of the above techniques.
Microsoft and Google, for instance, insist that your password be 16 characters or shorter.
Presumably they’ve decided that none of us will reliably remember anything longer, even if it does take the form of a witty jingle that we are much more likely to remember than a completely random text string.
Some sites prefer apparent variety to genuine entropy, so they’ll let you have a 6-character password, as long as you mix in at least one each of an upper case letter, a lower case letter, a digit and a punctuation mark.
In other words, poems just don’t fit: they’re too long, and don’t have any wacky characters in them.
So you might end up truncating your long and probably unguessable poetic password to something like this:
And then deliberately introducing the needed non-letters:
Which not only feels weaker, but is much less cool than a rhyming couplet.
As for why Microsoft and Google insist on 16 characters as a maximum, when your typed password is typically salted-hashed-and-stretched into a fixed-length string anyway…