What is the optimal password length

How to choose a password that best protects you during a data breach

10 mins

How strong a password should be

The more the better, of course, and with a password manager it’s trivially easy to generate and autofill a password of any length. But should that be something with hundreds of characters just to be sure, or is there a sensible lower limit to use as a rule of thumb?

Here is a typical password generator interface:

Note the length part which can go from around 8 all the way up to 100, while other ones can go much higher. What is a good setting to use for passwords?

A good password is all you have during a data breach

But to understand what a secure password is, let’s see what happens on the other side!

When you create an account, the service you’re registering with stores the password in one of many forms. It can put it directly into the database (also known as plain text) or hash it using one of the many available algorithms. Some of the most widely used hash algorithms include:

  • MD5
  • SHA-1
  • Bcrypt
  • Scrypt
  • Argon2

The upside of storing hashes instead of the passwords themselves is that the passwords are not in the database. And they shouldn’t be, as you only need to prove that you know yours but it does not matter what it is. When you log in, the provided password is hashed with the same algorithm and if the result matches the value stored then you’ve proven you know the password. And in the case of a database breach, passwords are not recoverable.

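[Figure: Storing hashes. On register, the user sends Pa$$w0Rd1992 to the service, which stores the hash 5e8539... in the db. On login, the user sends Pa$$w0Rd1992 again, the service computes 5e8539... and checks it against the db, and the login is OK.]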
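The register and login flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular service does it: the in-memory `db` dictionary, the `register` and `login` helpers, the per-user random salt, and the scrypt cost parameters are all assumptions made for the example.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory "database": username -> (salt, password hash).
db = {}

# Illustrative scrypt cost parameters (assumed, not a recommendation).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str, salt: bytes) -> bytes:
    """Hash the password with scrypt; only this value is ever stored."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)


def register(username: str, password: str) -> None:
    salt = os.urandom(16)  # random per-user salt (an assumption of this sketch)
    db[username] = (salt, hash_password(password, salt))


def login(username: str, password: str) -> bool:
    if username not in db:
        return False
    salt, stored_hash = db[username]
    # Hash the submitted password the same way and compare the results.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


register("alice", "Pa$$w0Rd1992")
print(login("alice", "Pa$$w0Rd1992"))  # True
print(login("alice", "Pa$$w0Rd1991"))  # False
```

The password itself never reaches `db`; the service only needs to reproduce the same hash from whatever the user submits at login.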


Password cracking

Password cracking is when an attacker tries to reverse the hash function and restore the password from the hash. With a good hashing algorithm, it’s not possible to recover the password, but nothing can be done against trying out various inputs to see if they yield the same result. If such a match is found, the password is recovered from the hash.

[Figure: Password cracking. The cracker hashes one guess after another: Pa$$w0Rd1990 gives db21d7..., Pa$$w0Rd1991 gives 247cb4..., and Pa$$w0Rd1992 gives 5e8539..., the hash it was looking for.]
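The guessing loop is simple enough to sketch. The example below assumes the leaked hash is an unsalted SHA-1 digest and that the attacker already suspects the base word; the `crack` helper and the list of guesses are hypothetical and only illustrate the idea.

```python
import hashlib


def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def crack(target_hash: str, candidates):
    """Return the first candidate whose hash matches, or None."""
    for guess in candidates:
        if sha1_hex(guess) == target_hash:
            return guess
    return None


# Pretend this hash was found in a leaked database.
leaked_hash = sha1_hex("Pa$$w0Rd1992")

# Hypothetical guess list: one base word with every year from 1950 to 2029.
guesses = (f"Pa$$w0Rd{year}" for year in range(1950, 2030))

print(crack(leaked_hash, guesses))  # Pa$$w0Rd1992
```

The hash function is never reversed here. The attacker only runs it forward on guesses until one of them produces the leaked value.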

And choosing a good algorithm makes a difference here. SHA-1 was designed for speed, which helps the cracking process. Bcrypt, Scrypt, and Argon2 were designed to be costly in various ways to make cracking as slow as possible, especially on dedicated hardware. And the difference is huge.
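To get a rough feel for the gap, the cost of a single guess can be timed for a fast and a deliberately slow hash. This is only a sketch: the scrypt parameters and the fixed salt are assumptions, and timings from Python on a laptop CPU say little about the absolute speed of dedicated cracking hardware. Only the ratio is of interest.

```python
import hashlib
import time


def seconds_per_hash(hash_fn, rounds: int) -> float:
    """Average wall-clock time of one call to hash_fn."""
    start = time.perf_counter()
    for _ in range(rounds):
        hash_fn(b"Pa$$w0Rd1992")
    return (time.perf_counter() - start) / rounds


def fast_hash(password: bytes) -> bytes:
    return hashlib.sha1(password).digest()


def slow_hash(password: bytes) -> bytes:
    # Assumed cost parameters and a fixed salt, for illustration only.
    return hashlib.scrypt(password, salt=b"example-salt", n=2**14, r=8, p=1)


sha1_time = seconds_per_hash(fast_hash, rounds=10_000)
scrypt_time = seconds_per_hash(slow_hash, rounds=5)

print(f"SHA-1:  {sha1_time:.2e} s per guess")
print(f"scrypt: {scrypt_time:.2e} s per guess")
print(f"scrypt is roughly {scrypt_time / sha1_time:,.0f}x slower per guess")
```

Every extra factor of slowdown per guess is a factor the attacker has to pay for each password they try.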

Considering only the speed, a SHA-1-hashed password that cannot be cracked looks like this: 0OVTrv62y2dLJahXjd4FVg81.

A safe password using a properly configured Argon2 hash: Pa$$w0Rd1992.

As you can see, choosing the right hashing algorithm can make an otherwise weak password an uncrackable one.

And remember that this is dependent only on how the service you’re registering to is implemented. And you have no way of knowing on which part of the spectrum the implementation stands. You can ask, but chances are they won’t even respond or say that they “take security seriously” or something similar that does not mean anything.

Do you think companies take security seriously and use a good hash algorithm instead of a crappy one? Look at the list of breached databases, especially the hashes that were used. Many of them still used MD5, most used SHA-1, and some used bcrypt. Some even stored passwords in plain text. That’s the reality you should assume.

There is a bias here, as we only know which hash was used for breached databases, and it’s likely that companies that use a weak algorithm also fail to safeguard their infrastructure. But just look at that list; I’m sure you’ll find familiar names you wouldn’t have expected to have weak security. Just because a company looks big and reputable does not mean it will do the right thing.

You choose the password

Where does that leave you as a user?

With plain-text passwords, you cannot do anything. If the database is stolen, your password strength does not matter.

With properly configured algorithms it also does not matter much how secure your password is, not considering trivial cases like 12345 and asdf.

But between them, especially with SHA-1, your choice matters. The hashing function is not suited for passwords in general, but if you use a secure password you can make up for the shortcomings of the algo.

Hash algo    asdf    AJnseykp    8VjB2qwD7eN3eG4Fjkfeks
None         -       -           -
MD5          -       -           ✓
SHA-1        -       -           ✓
Bcrypt       -       ✓*          ✓
Scrypt       -       ✓*          ✓
Argon2       -       ✓*          ✓

(✓ means the password withstands cracking after a breach, - means it does not.)

* It depends on the configuration. These hashes have various moving pieces that affect their strength, but when configured properly they can thwart cracking attempts.

The bottom line: if you use a strong password, you are protected in more breaches than with a weak one. And since you don’t know how secure the password storage is, you cannot be sure what is “secure enough” for a given service. So assume the worst case in which your choice of password still makes a difference.

Unique passwords are not enough

OK, but why should you care if you use a password manager and generate a unique password for every site? In this case, you are not vulnerable to credential stuffing which is when a known email/password pair is checked on other services in the hope that they are reused. And since password reuse is one of the biggest problems, this is a serious threat.

[Figure: Credential stuffing. The hacker takes the pairs email1/password1 and email2/password2 from a hacked service and tries them on another service: the login with email1/password1 is denied, the login with email2/password2 is OK.]

But generating a new password for every site protects you from this. And if a database is stolen, everything inside it is known to the hackers anyway, so why should you still protect the password?

The problem is when you don’t know that the database is breached and you continue to use the service. In this case, hackers have access to all your future activity on that site. You might add a credit card later and they still know about it. A strong password means they can’t log in with your credentials and can not compromise your future activity.

[Figure: Usage after breach. The user registers with email/password; the hacker hacks the database and cracks the password; the user logs in and adds sensitive info; the hacker logs in with the same email/password and gets the sensitive info.]

How to measure the password strength with entropy

Password strength is all about entropy, which is a numerical representation of how much randomness a password contains. Because the numbers involved are large, instead of saying there are 1,099,511,627,776 (2^40) different variations, it’s easier to say that a password has 40 bits of entropy. And password cracking is all about the number of variations: the more there are, the more time it takes to try out all the possibilities.

For random characters generated by password managers entropy is easy to calculate: log2(<number of different characters> ^ <length>).

The length is trivial, but what is the number of different characters? It depends on the character classes a password has.

Character class          Example    Number of characters
Only lowercase letters   abc        26
+ uppercase letters      aBc        52
+ numbers                aBc1       62
+ special characters     aB?c1      84

As an example, a password of length 10 containing a random mix of lower and uppercase letters has log2(52 ^ 10) = 57 bits of entropy.
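The formula translates directly into code. The `entropy_bits` helper below is a hypothetical name for this sketch, and it only holds for passwords whose characters are chosen independently and uniformly at random.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a password whose characters are picked independently at random."""
    return math.log2(alphabet_size ** length)


print(round(entropy_bits(52, 10)))  # 57
print(round(entropy_bits(26, 10)))  # 47
```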

The above math expression can be simplified to see how much entropy a single character of a given class brings to the overall strength using the expression log2(n ^ m) = m * log2(n). This yields: <length> * log2(<number of different characters>), where the second part is the entropy per character. The above table, using this formula:

Character class          Example    Entropy / character (bits)
Only lowercase letters   abc        4.7
+ uppercase letters      aBc        5.7
+ numbers                aBc1       5.95
+ special characters     aB?c1      6.4

To calculate the strength of a password, consider the character classes it is made of, get the entropy number from the table, and multiply it by the length. The example above (lower + uppercase letters of length 10) yields 5.7 * 10 = 57 bits. If you increase the length to 14, the entropy jumps to 79.8 bits. If you instead keep the length at 10 but add numbers and special characters, the total entropy is only 64 bits.
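The per-character figures and the length-versus-character-class comparison can be reproduced with a short script. The names below are made up for this sketch. Note that the unrounded result for the 84-character alphabet is 63.9 bits; the 64 in the text comes from the rounded 6.4 bits per character.

```python
import math

# Alphabet sizes from the character class table above.
ALPHABET_SIZES = {
    "lowercase": 26,
    "lower + upper": 52,
    "lower + upper + numbers": 62,
    "lower + upper + numbers + special": 84,
}


def bits_per_character(alphabet_size: int) -> float:
    return math.log2(alphabet_size)


def strength_bits(alphabet_size: int, length: int) -> float:
    return length * bits_per_character(alphabet_size)


for name, size in ALPHABET_SIZES.items():
    print(f"{name:35s} {bits_per_character(size):.2f} bits/character")

# Longer password vs. more character classes, starting from 52 symbols x 10.
print(f"{strength_bits(52, 10):.1f}")  # 57.0
print(f"{strength_bits(52, 14):.1f}")  # 79.8
print(f"{strength_bits(84, 10):.1f}")  # 63.9
```

Adding four characters gains far more entropy than adding two character classes, which is why length is the cheaper lever for generated passwords.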

The above expression offers a quick way to calculate how much entropy a password has, but comes with a caveat. It only applies when the characters are independent of each other, which is only true for generated passwords.

The password H8QavhV2gu satisfies this criterion. It is drawn from lowercase letters, uppercase letters, and numbers, so it has about 59.5 bits of entropy (5.95 * 10).

But a password that is easier to remember, such as Pa$$word11, has a lot less entropy despite having the same length and more character classes. A cracker does not need to try all the combinations, only the words from a dictionary with some transformations.

Therefore, any calculation based on multiplying the length by the entropy of the character class is only valid for generated passwords.
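A small sketch shows why predictable transformations add so little. The word list and the transformation rules below are hypothetical and far smaller than what real cracking tools use, but they are already enough to reach Pa$$word11.

```python
from itertools import product

# Hypothetical, tiny word list; real cracking dictionaries are far larger.
WORDS = ["password", "dragon", "letmein"]


def variants(word: str):
    """Yield the word with a few common, predictable transformations applied."""
    capitalizations = {word, word.capitalize()}
    substitutions = [lambda w: w, lambda w: w.replace("s", "$")]
    suffixes = [""] + [str(n) for n in range(100)]
    for base, substitute, suffix in product(capitalizations, substitutions, suffixes):
        yield substitute(base) + suffix


candidates = {candidate for word in WORDS for candidate in variants(word)}

print("Pa$$word11" in candidates)  # True
print(len(candidates) < 10_000)    # True
print(84 ** 10)                    # size of the full 10-character search space
```

The cracker reaches the password after at most a few thousand guesses, while the length-times-entropy formula would have suggested a search space of 84^10.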

Guidelines for entropy

The more entropy a password has, the harder it is to crack, but how much is enough? The general wisdom is that ~16 characters should be more than enough for a password, which yields between 95 and 102 bits, depending on whether special characters are included. But what is the threshold? 80 bits? 60 bits? Or is even 102 bits too low?

There is another algorithm that is similar to a bad password hashing algorithm in terms of speed but is far better studied: AES encryption.

This is used to encrypt everything secret in all sorts of government and military institutions and therefore its strength is well considered. And it’s fast, so if a key with a specific amount of entropy can not be cracked for AES then it will be good for a password with a bad (but not broken) hash.

NIST (the National Institute of Standards and Technology) is the entity that defines which key sizes are good for the foreseeable future. Their recommendation is AES-128 for “2019 - 2030 & beyond”, which, as the name implies, has 128 bits of entropy.

Another recommendation specifically for key sizes is to use at least 112 bits of entropy:

For the Federal Government, a security strength of at least 112 bits is required at this time for applying cryptographic protection (e.g., for encrypting or signing data).

To get 128 bits of entropy using lower and uppercase letters and numbers, a length of 22 (5.95 * 22 = 131 bits) is needed.
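The same arithmetic can be turned around to ask how long a generated password has to be for a given target. The `required_length` helper is a hypothetical name for this sketch.

```python
import math


def required_length(target_bits: float, alphabet_size: int) -> int:
    """Smallest length whose entropy reaches target_bits for a random password."""
    return math.ceil(target_bits / math.log2(alphabet_size))


print(required_length(128, 62))  # 22
print(required_length(112, 62))  # 19
```

With the 62-character alphabet, the 112-bit minimum quoted above is reached at 19 characters and the 128-bit target at 22.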

Other considerations

Why no special characters?

I tend not to use special characters because they break word boundaries. This means selecting the password takes 3 clicks instead of 2, and that can produce errors where I accidentally leave part of the password out when pasting it into the input field.

With just letters and numbers, a double-click always selects the whole password.
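A password manager does this for you, but for illustration, generating such a password takes only a few lines. This sketch assumes Python’s standard `secrets` module as the source of randomness; `generate_password` and the default length of 22 are choices made for the example.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters


def generate_password(length: int = 22) -> str:
    """Pick each character independently with a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

Because every character is drawn independently from the same 62 symbols, the length-times-entropy calculation applies to the result.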

What if there is a maximum length?

Some sites define the maximum password length which prevents you from using 22 characters. There are cases where it goes to extreme lengths, like requiring exactly 5 digits.

In this case, use the maximum length available; there is little else you can do.

There are also recommendations for services on how to handle passwords, and limiting the length clearly goes against them. NIST says:

Allow at least 64 characters in length to support the use of passphrases. Encourage users to make memorized secrets as lengthy as they want, using any characters they like (including spaces), thus aiding memorization.

And remember that the service can store passwords in a way ranging from terrible to superb, and they won’t tell you exactly how they do it. A short maximum password length gives the impression that they are on the worse end of that spectrum.

Conclusion

Strong passwords are needed even if you don’t reuse them. Strength is measured in entropy, and you should aim for 128 bits of it. A lowercase + uppercase + numbers password with a length of 22 will be above this threshold.

This should protect you in case of a data breach where the provider uses a weak but unbroken hashing algorithm.

26 May 2020