I've been putting off writing this for a while, mostly because I hoped that this would be obvious to people in the security world. However, a few things happened lately and I thought I would finally bite the bullet and type this up. Most of this comes directly from the NIST Electronic Authentication Guideline, which is well worth a read.
[Image: Claude Shannon. By DobriZheglov (Own work), CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0), via Wikimedia Commons]
In 1948 a gentleman called Claude Shannon wrote an article called "A Mathematical Theory of Communication" which gave birth to a new field of mathematics called Information Theory. It has had a profound impact on how we define and use information.
Within Prediction and Entropy of Printed English, Shannon states that:
The entropy is a statistical parameter which measures, in a certain sense, how much information is produced on the average for each letter of a text in the language. If the language is translated into binary digits (0 or 1) in the most efficient way, the entropy is the average number of binary digits required per letter of the original language.
In thermodynamics, entropy is described as:
A measure of disorder in the universe or of the availability of the energy in a system to do work.
Similarly, within Information Theory, Shannon defined entropy as the uncertainty of symbols chosen from a given set of symbols, based on a priori probabilities, i.e. the probability that a given variable (X) takes a particular value.
This type of entropy is used to define how difficult it is to guess a particular password. The higher the entropy, the harder it is to guess what the password is.
Why does this matter? Because we're currently in an arms race when it comes to passwords, and customers are definitely suffering because of it. In a battle where entropy is the winning factor, humans are always going to lose. Humans aren't designed to be random; we're geared to see patterns. We're really good at patterns, so good in fact that we often see patterns which aren't there.
Like most things which become a problem, it all started swimmingly... Back in the 80s and 90s, cracking password hashes was a time-consuming business. Many hours I spent tuning my word lists and running those passwd and shadow files through Alec Muffett's Crack (I am sure that can be taken in many different ways, so let's just keep moving on). If you had good password security, it was possible to be secure without being too onerous on the user. Today things have definitely changed: using GPUs and hashcat, billions of hashes can be computed in seconds.
What does this mean for users? Password hell. We keep pushing for passwords to have higher entropy. It's got to the point where we now have applications dedicated to generating passwords for users and keeping track of them, because for passwords to be secure they really need to be completely random. By the way, I'm not saying you shouldn't use these; you definitely should, but we still need a password to protect the other passwords so....
So what's the solution?
Before we go further, let's do some tests, and before we can do that we need to define how we're calculating entropy. If we have a number (b) of values within a set (e.g. the letters of the alphabet) and a password of length (l), then the number of possible passwords is b^l. For example, using the alphabet and a length of 6 characters we get 26^6 = 308915776. However, entropy is normally expressed in bits, so if we take log2(308915776) we get roughly 28.2. So to abstract out that formula we have:
H = log2(b^l) = l × log2(b)
Where b is the cardinality of the set which the value can be chosen from.
Where l is the length of the password.
Where H is the entropy in bits of the password.
There are 95 printable ASCII characters, so let's look at entropy as password length increases:
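A minimal sketch of this calculation, assuming every character is picked uniformly at random so that H = l × log2(b). The names `entropy_bits` and `PRINTABLE_ASCII` are illustrative and not from any library.

```python
import math


def entropy_bits(b: int, l: int) -> float:
    """Entropy in bits of a password of length l whose characters are
    chosen uniformly at random from a set of b symbols: log2(b^l)."""
    return l * math.log2(b)


# Worked example from the text: 26 letters, 6 characters.
print(f"26^6 = {26 ** 6} combinations, H = {entropy_bits(26, 6):.1f} bits")

# The 95 printable ASCII characters at a few illustrative lengths.
PRINTABLE_ASCII = 95
for length in (4, 8, 12, 16, 20, 40):
    bits = entropy_bits(PRINTABLE_ASCII, length)
    print(f"{length:>2} characters: {bits:6.1f} bits")
```

Each extra character adds the same log2(95) bits, a little over 6.5, which is why the growth is a straight line.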
This really shouldn't be surprising, right? As the length increases uniformly, the entropy increases along with it. This describes the entropy if there is an equal chance of selecting any member of the set, but humans don't select their characters randomly. Typically passwords are selected based on the language the person speaks; if she speaks English, then in most cases she will select passwords based on the English language.
As Shannon pointed out, the English language has a set of properties which significantly reduce its entropy: capital letters tend to be used at the beginning of a word rather than in the middle of it, and certain pairings of letters are far more likely than others (i before e except after c, u next to q, etc.). These conventions and properties of a language reduce its entropy. This is compounded by the fact that when users are forced by password policy to add digits or symbols, they tend to substitute 1 for i, $ for s, etc., rather than sprinkling these characters throughout the password.
All this means that the entropy of real-life passwords isn't anywhere near as high as that of a random selection. The excellent NIST Electronic Authentication Guideline has devised a "scoring" system to help determine a more realistic entropy based on human behavior. It's defined as:
• the entropy of the first character is taken to be 4 bits;
• the entropy of the next 7 characters are 2 bits per character; this is roughly consistent with Shannon’s estimate that “when statistical effects extending over not more than 8 letters are considered the entropy is roughly 2.3 bits per character;”
• for the 9th through the 20th character the entropy is taken to be 1.5 bits per character;
• for characters 21 and above the entropy is taken to be 1 bit per character;
• A “bonus” of 6 bits of entropy is assigned for a composition rule that requires both upper case and non-alphabetic characters. This forces the use of these characters, but in many cases these characters will occur only at the beginning or the end of the password, and it reduces the total search space somewhat, so the benefit is probably modest and nearly independent of the length of the password;
• A bonus of up to 6 bits of entropy is added for an extensive dictionary check. If the attacker knows the dictionary, he can avoid testing those passwords, and will in any event, be able to guess much of the dictionary, which will, however, be the most likely selected passwords in the absence of a dictionary rule.
Using that scheme, they determined the following results:
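The scoring rules above can be sketched roughly as follows. This is an illustration of the quoted bullets, not NIST's own code: the function name `nist_entropy_estimate` and its parameters are made up for this example, and because the guideline only says "up to 6 bits" for the dictionary check, that bonus is left as a value the caller supplies.

```python
def nist_entropy_estimate(length: int,
                          composition_rule: bool = False,
                          dictionary_bonus_bits: float = 0.0) -> float:
    """Estimate the entropy (in bits) of a user-chosen password using the
    per-character scoring quoted above. Parameter names are hypothetical."""
    if length < 1:
        return 0.0
    if not 0.0 <= dictionary_bonus_bits <= 6.0:
        raise ValueError("dictionary bonus is at most 6 bits")

    bits = 4.0                                   # first character
    bits += 2.0 * (min(length, 8) - 1)           # characters 2 to 8
    bits += 1.5 * max(0, min(length, 20) - 8)    # characters 9 to 20
    bits += 1.0 * max(0, length - 20)            # characters 21 and above
    if composition_rule:
        bits += 6.0                              # upper case + non-alphabetic rule
    return bits + dictionary_bonus_bits


for length in (8, 12, 16, 20, 40):
    plain = nist_entropy_estimate(length)
    composed = nist_entropy_estimate(length, composition_rule=True)
    print(f"{length:>2} characters: {plain:5.1f} bits, "
          f"{composed:5.1f} bits with composition rule")
```

Under these assumptions an 8 character password scores 18 bits (24 with the composition rule), while a 40 character one scores 56 bits before any bonus.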
The results are interesting: enforcing complexity and testing passwords against dictionaries of known common passwords does indeed increase the entropy of shorter passwords, but as the size of the password increases, the dominating factor again becomes the size of the password.
So in a rather convoluted manner we're back to the original proposition, what's the best practice when it comes to passwords?
Well, the solution really is bigger passwords. The problem is that when people think of passwords, they think exactly that: a pass word, as in "speak friend and enter". But this reduces the scope for entropy significantly, because 80% of words are between 2 and 7 letters long.
Looking at the bell curve, after about 14 characters the increased entropy from dictionary and complexity rules starts to become redundant, but that only leaves around 2% of English words we can use. On the bright side, we could start looking at Germanic words, which fare a lot better, but trying to enforce that is probably not going to make for a great user experience.
So how do we get users to use more secure passwords without making them memorize gibberish? It's really simple: we start the long process of changing how people think of passwords and get them to start using pass phrases.
What's easier to remember: "Plato told us Thales fell down the well.", which is 40 characters and well above our sweet spot, or "rb97P)eE4.Y3"? They both have similar entropy, but one is far easier to remember than the other.
This is a very long-winded way of saying something very obvious, but the next time you're making something that asks for a password, maybe give a passphrase as the example rather than a password. It's going to take a while to move people away from thinking about single words. For a long time software has conditioned people to think of passwords that way.
It's common even today to see systems which restrict the use of spaces or put 12-20 character limits on passwords. It's a tremendous disservice to the industry to do this. There really isn't any need for it anyway: these passwords are going to be hashed, so it's not like the hashes are going to get bigger.
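To illustrate the point about hash size, here is a small sketch using PBKDF2 from Python's standard `hashlib` module. The helper name `stored_hash`, the fixed demo salt, the iteration count, and the longer example phrase are all assumptions made for this example; a real system would use a random salt per user and its own tuned parameters.

```python
import hashlib


def stored_hash(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive the value a system might store for a password.
    PBKDF2-HMAC-SHA256 is used here purely as an illustration."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


# Fixed salt for demonstration only; use a random per-user salt in practice.
demo_salt = b"demo-salt-16byte"

short_hash = stored_hash("rb97P)eE4.Y3", demo_salt)
long_hash = stored_hash("a much longer pass phrase, with spaces in it", demo_salt)

# Both digests have the same length regardless of the input length.
print(len(short_hash), len(long_hash))
```

Whatever the length of the input, the stored value is the same size, so a short maximum length or a ban on spaces saves nothing on the storage side.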
This was written at the request of a friend, and I would just like to apologize to them for how long it's taken me to actually get around to writing this. Hopefully you'll forgive the delay =)