Passwords continued...

I am known to frequently carry on and on about passwords, but it is time to do so again. There is a widely popular comic on the web, basically explaining that all the advice given to people about choosing strong passwords is incorrect, as people end up choosing passwords that are almost impossible to remember yet easy to brute force.

As an example, take this password:

!Q@w#e$r%T

Does it look strong? Sure, you say. Based on what I wrote here, it is 10 characters long, has a combination of all the character classes, does not seem obvious, so it should be secure, right? Wrong. This is an example (similar to, not the same as) from a production environment, representative of root user passwords.

The problem with that password is not complexity - brute forcing that password will not happen in our lifetime with current technology. Thing is, today hackers use large dictionaries and rule sets to crack passwords. In this case, the password exhibits a pattern on the QWERTY keyboard. Look closely, there is a clear pattern to how it is typed. And once a password follows some logic, it can be anticipated and included in dictionaries / rules. Leaks such as LinkedIn, Yahoo, eHarmony etc. are much more than they seem. Hackers from all over the world invest a lot of time and effort to crack the passwords, mainly to understand how the human brain chooses passwords. So if a site is hacked that does not salt its passwords for instance, a rainbow table can be used to quickly crack all the passwords. The patterns and structure of those passwords then get fed into the new dictionary lists and rule based engines, making them much stronger. Substituting 0 for o and 1 for i is no longer effective - in fact, it adds maybe one second to the time to crack your password. No matter what you do, if it can be expected it can be coded for and cracked.
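To make the keyboard pattern concrete, here is a minimal sketch of the kind of check a cracking rule could encode. It assumes a US QWERTY layout, and the function names are my own invention for illustration, not taken from any cracking tool.

```python
# Map shifted number-row symbols back to their digit keys (US QWERTY assumed).
SHIFTED = dict(zip("!@#$%^&*()", "1234567890"))
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def unshift(password: str) -> str:
    """Reduce a password to the physical keys pressed, ignoring Shift."""
    return "".join(SHIFTED.get(ch, ch.lower()) for ch in password)


def is_row_run(keys: str) -> bool:
    """True if `keys` is a left-to-right run of adjacent keys on one row."""
    return len(keys) >= 3 and any(keys in row for row in ROWS)


def is_keyboard_zigzag(password: str) -> bool:
    """True if the password alternates between two row runs, e.g. 1q2w3e."""
    keys = unshift(password)
    return is_row_run(keys[0::2]) and is_row_run(keys[1::2])


print(unshift("!Q@w#e$r%T"))             # 1q2w3e4r5t
print(is_keyboard_zigzag("!Q@w#e$r%T"))  # True
print(is_keyboard_zigzag("tlpWENT2m"))   # False
```

Once Shift is ignored, the "complex" password is just 12345 interleaved with qwert, which is exactly the sort of structure a rule set can enumerate.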

By the way, I cracked that password above in 2 seconds on my system that runs through about 200 million passwords / second - rather slow by modern standards.

The problem I want to address is the huge amount of misinformation spread throughout the web regarding passwords. Let's take a look at some of it:

Microsoft - I typed in passwordpassword and the system told me it is strong. It took me 2 seconds to crack it. It also told me the password in the opening paragraph is Medium strength.
This site suggests this is a good password: tlpWENT2m and to an extent I agree, it is 9 characters and hard to guess. Yet the password validator above suggests it is Weak.
Try another site, and passwordpassword now scores 11%, which is good. However !Q@w#e$r%T scores 100%, yet I cracked it within 2 seconds. aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa is considered 0%, very weak. Yet I guarantee you nobody will crack it.
1Password considers !Q@w#e$r%T to be an excellent password.

1Password considers e^4AQF a "Good" password:

yet any random 6 character password in the printable character set can be cracked within 13 minutes on mid range hardware using brute force techniques:

adea7c09f7f73335dfadf4884923226e:e^4AQF

Status.......: Cracked
Input.Mode...: Mask (?1?1?1?1?1?1)
Hash.Target..: adea7c09f7f73335dfadf4884923226e
Hash.Type....: MD5
Time.Running.: 4 mins, 23 secs
Time.Util....: 263506.8ms/143.1ms Real/CPU, 0.1% idle
Speed........: 924.0M c/s Real, 929.7M c/s GPU
Recovered....: 1/1 Digests, 1/1 Salts
Progress.....: 243481133056/735091890625 (33.12%)
Rejected.....: 0/243481133056 (0.00%)
HW.Monitor.#1: 0% GPU, 58c Temp
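The 13 minute figure can be checked against the output above. This is a small sketch, assuming the mask covers the 95 printable ASCII characters (which matches the total shown on the Progress line) and that the speed stays constant at the reported rate.

```python
PRINTABLE = 95     # printable ASCII characters, space included
LENGTH = 6
SPEED = 924.0e6    # candidates per second, from the Speed line above

keyspace = PRINTABLE ** LENGTH
worst_case_minutes = keyspace / SPEED / 60

print(keyspace)                      # 735091890625
print(round(worst_case_minutes, 1))  # 13.3
```

That is the time to exhaust the whole key space. This particular password fell after about a third of it.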

This site tries to tell you that long pass phrases are useless.

So what does this mean?

Do not rely on password strength validators for any useful information.
Do not assume just because a password is long and looks random it is secure.
Most people have no clue what a strong password is.
The definition of a "strong" or "good" password is relative. Does it refer to how well a password will hold up against offline brute force attacks against its hash, or does it refer to how well it holds up against online attacks, such as your sister trying to log in to your gmail account? This article speaks to the former case - i.e. the worst case scenario. For avoiding purely online password guess attempts the bar can be lowered considerably. In that case a 6 character random password is next to impossible to crack. It will take 23 years to crack such a password if only 1000 guesses per second can be made.
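The 23 year figure for online guessing follows from the same key space. A quick sketch, assuming 95 printable characters and a constant guess rate (the function name is mine):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of the given length."""
    return alphabet_size ** length / guesses_per_second / SECONDS_PER_YEAR


print(round(years_to_exhaust(95, 6, 1000), 1))  # 23.3
```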

There are two sides of this coin to consider. The first being that most people do not see the value in protecting their passwords. The second being people choosing poor passwords as they do not understand the mechanics of password cracking.

Let's first consider the social issue. Many people I speak to, business people included, tell me things such as:

We have not been hacked yet, therefore there is no reason to believe this would change in the future.
I do not store confidential information in my account so I do not really care about my password being compromised.
The system I use does not allow more than 5 login attempts before it locks the system, therefore nobody can guess my password.
Why would anyone care to hack my account? I am nobody.

Unfortunately these comments are not based on reality. In reality:

As time goes on you are only more likely to get hacked. This is a simple law of probability. The longer you exist with your digital presence the larger the chances are of being targeted. It is like driving your car. Saying 3 years after buying your car that since you have not yet had a flat tire it is safe to assume you will never have a flat tire is just plain nonsense.
You may not store confidential information but consider these two aspects:
- You might be sharing passwords across accounts - most people do, hence if the attacker compromises your gmail account then they might be able to log in to your bank site too. I am pretty sure you would mind that.
- You might not think you store confidential information but think about this: If someone steals your Facebook details they know an awful lot about you and can transition that attack to a physical attack. If someone steals your gmail account they can reset other accounts' passwords that you might have as the password reset emails would perhaps go to your gmail account.
Thinking that system lockouts would help is not entirely ridiculous. However if someone cracks your password on another site and you shared passwords, they will enter the correct password the first time. If they compromise the site and steal the password hashes they can attack it offline at a rate of 50 billion passwords per second easily. And if your password is even slightly obvious / weak, a long term paced attack can find that password. If a hacker tries 4 times, waits for the exact duration your system needs to reset its failed login count, and continues like that for a couple of weeks, he might just crack it. This is what happened in the real world with some FTP accounts on an old system I used to manage.
You might be nobody but you exist in the digital world, therefore you will be a target sooner or later. Most of the time you would not be targeted because of who you are, but rather because of random probability. There is a population of 7 billion people that theoretically has the capacity to hack your account. It goes like this - LinkedIn is a large public site with many users. So it is an obvious target. A hacker manages to steal the password hashes of all user accounts and starts cracking them and selling those passwords to the highest bidder. The fact that you happen to have a LinkedIn account is of no relevance to the hacker - he does not care as he does not know you. However someone wants to use a LinkedIn account for nefarious purposes. So he picks yours. When he logs in he happens to see you mention Facebook, so he tries to log in to your Facebook profile using the same credentials. Since you shared passwords, he gets in. There he sees you are with ABC bank, as you made a comment regarding their ridiculous service. He tries to log in there and sees he can't - different password. So he targets you now because he has a lot of personal information. Taking personal information from LinkedIn and Facebook he builds a portfolio of your digital presence. Next he calls the bank and authorises himself as a beneficiary because he knows a lot about you and can answer your security questions. He takes all your money. Why? Because he can.

Finally, let's consider the issue of poor passwords - something even seasoned IT professionals struggle with:

The most secure password is a long, truly random string of mixed characters generated by a good password management utility, and stored with an association to the account you use it for. Each account you own should have its own unique password. For now a length of 10 characters is more than adequate to keep everyone out assuming:
- Proper hashing of the password occurs on the site in question, coupled with salting the password.
- Not using a weak hashing algorithm like MD5 which is known to suffer from collisions.
- The password you choose is truly random.
Assuming all those conditions are true, then it will take an affordable yet high end system more than 10 000 years to recover your password. Remember this password is purely random so assuming it does not happen to be a nice dictionary word (remember, the word sophisticated is 13 characters long, and is one of the many potential random combinations one will get with a true random password generator, even though to us it does not look random and will be cracked with a dictionary attack quickly), the only way to crack it would be via brute force - i.e. trying all combinations one by one. Since computing power supposedly doubles every 18 months, in 10 years' time computers should be 100 times faster than now. Which means it will still take more than 100 years to break the password - in my book that is still secure for all but government needs. A truly random password might look like this: 1(w8Ymq"c A dictionary based attack failed. Brute force will take much more than 10 years (the application I use stops at 10 years).
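The first two assumptions above (salting, and not using a fast hash such as MD5) are the site's job, not yours. For illustration only, this is roughly what they look like on the server side. PBKDF2-HMAC-SHA256 from the Python standard library is my choice for this sketch, and the iteration count is an assumed value that would need tuning. Neither is something the sites discussed here are known to use.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; higher means slower guessing


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest), using a fresh random salt for every password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


salt, digest = hash_password("dog skirr flies run")
print(verify_password("dog skirr flies run", salt, digest))  # True
print(verify_password("dog cat patio peach", salt, digest))  # False
```

Because every password gets its own salt, two users with the same password end up with different digests, so a precomputed rainbow table is of no use. The iteration count makes each guess deliberately expensive.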
The problems with the above recommendation are
- It is hard to remember, especially for multiple accounts. This can be circumvented using a password manager.
- Not all sites accept arbitrary passwords. Many sites, especially banking sites, limit passwords to case insensitive, no special characters etc. This significantly reduces your options. As an example, one site I know enforces case insensitive passwords, 8 characters or shorter and no special symbols. On my system that key space can be exhausted in 53 minutes. Best you can do is to still choose a random password and protect your user name too, and ensure you do not re-use that account.
- Without your password manager you cannot log in to a site.
- Typing random letters is hard on a mobile device - without your password manager it might frustrate you.
So the second option is to use pass phrases. Despite some papers to the contrary, if done right simple math tells you that a properly chosen passphrase can be easy to remember and hard to crack. Consider the following:

A random password consisting of 10 characters in the set a-z, A-Z, 0-9, special characters has an entropy of about 66 bits (that set of characters consists of 95 unique characters, and the entropy per character is given as log₂(n). Since n is 95, the entropy of a single character is 6.6 bits). I just said that this is basically unbreakable using today's computing power.
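The entropy figures are easy to reproduce. A minimal sketch (the helper name is mine):

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of `length` symbols drawn uniformly at random from the alphabet."""
    return length * math.log2(alphabet_size)


print(round(math.log2(95), 2))          # 6.57 bits per character
print(round(entropy_bits(95, 10), 1))   # 65.7 bits for 10 characters
```

This only holds when every character is chosen uniformly at random. A password picked by a human has far less entropy than the formula suggests.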

If you treat a word as a character, and use a couple of truly random words in a pass phrase, with very few words you can create an incredibly strong password. If you have a word list such as Diceware that consists of 7776 words, each word will have an entropy of log₂(n) which is 12.92 bits per word. Since we are using words as characters, and since the entropy per word is so much bigger than per character, we need fewer words. It is as easy for a human to remember a word as to remember a single character, yet for a computer this is not true. So if you choose 4 truly random words, say

dog cat patio peach

then the total entropy of that phrase is 51.68 bits. It sounds low, and probably is too depending on the words chosen. That can be cracked on top end hardware in about 10 hours provided the correct dictionary is used (Diceware in this case). Change the passphrase to

dog cat flies pasta

and suddenly it gets much tougher, as two of those words are not in the Diceware list. So now the attacker has to expand his word list. Let's say he chooses a dictionary containing the most common words, and let's assume this is a 50 000 word dictionary that contains the words listed above. Suddenly the entropy per word is 15.6 bits, meaning that passphrase is now 62.4 bits strong, which will take 2 years to crack. Spice it up by including an uncommon word and you are quite safe. Let's use this passphrase:

dog skirr flies run

The word "skirr" is very uncommon, and to catch that you need to use a larger dictionary. Currently it is estimated that there are between 250 000 and 1 000 000 English words. Let's take a figure in between - 500 000. Let's assume this is a well represented dictionary. If that is my dictionary, cracking the password above suddenly involves 18.9 bits of entropy per word, which is 75.6 bits total entropy which will take 20 000 years to crack assuming (like in all instances above) one can try 100 billion passwords per second.
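The three dictionary sizes above can be compared side by side. This sketch assumes the attacker knows the exact dictionary, that all 4 words are drawn uniformly from it, and the 100 billion guesses per second used throughout. The function name is mine.

```python
import math

GUESSES_PER_SECOND = 100e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def passphrase_stats(dictionary_size: int, words: int = 4) -> tuple[float, float]:
    """Return (entropy in bits, worst-case seconds to try every combination)."""
    bits = words * math.log2(dictionary_size)
    seconds = dictionary_size ** words / GUESSES_PER_SECOND
    return bits, seconds


for name, size in [("Diceware", 7776), ("common words", 50_000), ("large dictionary", 500_000)]:
    bits, seconds = passphrase_stats(size)
    print(f"{name}: {bits:.1f} bits, {seconds / 3600:.1f} hours, "
          f"{seconds / SECONDS_PER_YEAR:.1f} years")
```

Running it gives about 10 hours for Diceware, about 2 years for the 50 000 word list and just under 20 000 years for the 500 000 word list.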

Since hackers would always take parallel approaches, we need to ensure the password is long enough. This one is 19 characters long (spaces included), all lowercase. Even though it consists only of lowercase characters, it will take half a billion years to crack via brute force. We are OK.
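The half a billion years can be reproduced if the brute force alphabet is taken to be the 26 lowercase letters plus the space. That alphabet is my assumption, picked because it matches the figure. The guess rate is the same 100 billion per second as before.

```python
ALPHABET_SIZE = 27          # a-z plus space (assumed)
LENGTH = len("dog skirr flies run")
GUESSES_PER_SECOND = 100e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years = ALPHABET_SIZE ** LENGTH / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(LENGTH)            # 19
print(f"{years:.2e}")    # roughly 5e8, i.e. half a billion years
```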

One thing to keep in mind though - this works only if you do not make the !Q@w#e$r%T mistake with pass phrases by choosing something obvious. This is BAD:

twinkle twinkle little star

as it is a common phrase. Even this is bad:

people love to dance

because it is grammatically sound English, and that introduces a pattern that can be anticipated and countered by refining the way the hacker combines words from the word list. Verbs will follow nouns for instance, significantly reducing the entropy. The whole point is to choose purely random words that make no sense to any human.
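The safest way to get words that make no sense together is to let a machine pick them. A minimal sketch using Python's secrets module follows. The eight word list is a hypothetical stand-in so the example runs on its own. A real list needs thousands of entries (Diceware has 7776) for the entropy figures above to apply.

```python
import secrets

# Hypothetical stand-in list, far too short for real use.
WORDLIST = ["dog", "cat", "patio", "peach", "flies", "pasta", "skirr", "run"]


def random_passphrase(wordlist: list[str], count: int = 4) -> str:
    """Pick `count` words independently and uniformly at random."""
    return " ".join(secrets.choice(wordlist) for _ in range(count))


print(random_passphrase(WORDLIST))
```

Do not reroll until the phrase "sounds nice". Rejecting results you dislike puts the human pattern right back in.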

Take note that the password below does not increase entropy and actually is easily crackable:

d0g cat P4tio

Common substitutions do not make the password stronger.
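To see how little the substitutions buy, count the variants an attacker has to try. The four rule table below is an assumed, deliberately small example. Real rule sets are larger, but the idea is the same.

```python
import math
from itertools import product

LEET = {"o": "0", "a": "4", "i": "1", "e": "3"}  # assumed substitution rules


def leet_variants(phrase: str) -> set[str]:
    """Every spelling of `phrase` with each substitutable letter swapped or not."""
    options = [(ch, LEET[ch]) if ch in LEET else (ch,) for ch in phrase]
    return {"".join(combo) for combo in product(*options)}


variants = leet_variants("dog cat patio")
print(len(variants))                         # 32
print("d0g cat P4tio".lower() in variants)   # True
print(math.log2(len(variants)))              # 5.0
```

Five extra bits for the whole phrase, compared to roughly 13 bits for adding one more random Diceware word. Trying both cases for the first letter of each word costs the attacker another small factor of the same kind.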

A great password should be a truly random combination of uppercase and lowercase letters, digits and symbols, be at least 10 characters long and should not be shared between multiple accounts. Example: 1(w8Ymq"c
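Such a password should come from a cryptographically secure random source, not from your head. A minimal sketch with Python's secrets module follows. The alphabet here is the 94 printable ASCII characters without the space, which is my choice for the example.

```python
import secrets
import string

# Letters, digits and punctuation: 94 characters, space excluded (assumed alphabet).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 10) -> str:
    """Draw each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```

A password manager does the same thing for you and remembers the result.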

Another great password is a truly random combination of at least 4 words, ensuring at least one word is uncommon (not by virtue of fancy substitutions but rather just not likely to be found in a common word list), and ensuring the total length of the password exceeds 15 characters. Also, it should not be re-used between accounts. Example: dog skirr flies run

If you are trying to protect against online attacks only, a truly random 6 character password will be as strong as a 10 character password. Likewise, a truly random 3 word pass phrase - even using common words, as long as it consists of at least 9 characters - will be as strong as a four word pass phrase with an uncommon word in the mix. This is based on the assumption of a maximum of 1000 password guess attempts per second.