User Account Security: Strong Passwords

The Goal

The aim of this knowledge base entry is to give guidance on how companies can ensure that their processes and systems allow and encourage users to select strong passwords. Whether you’re looking to write a password policy document, or looking to enforce a policy within a system or application, this entry should give you a good overview of what makes a secure password.

It’s also important to understand that reasonable adults will disagree and so we’ve tried to include as much of the “why” behind this guidance as possible.

Key Takeaways

Understand the different approaches to selecting passwords and their trade offs
Understand how enforcing password rotation can lead to weaker user passwords
Understand that enforcing password complexity is unlikely to result in a significant increase in password strength
Understand how badly implemented password strength meters can unintentionally lead users to select weaker passwords

The Trade Off

With many fundamental aspects of security, there is an unavoidable trade off between security and usability. For example, longer and more random passwords are likely to be more secure, but we have to balance that with the user’s ability to both remember and physically enter the password on different systems.

What are we protecting against?

It’s critical when designing any secure system to determine, and keep in mind, exactly what it is that you’re protecting against. In this instance, we’re looking to protect users from two separate attacks:

Online password guessing – where the attacker is attempting to guess the user’s password directly against the login system itself. In this instance there are multiple protection mechanisms in play, including the potential for the system to employ both attempt rate limiting and account lockout protections to prevent a successful attack.
Offline password cracking – where the attacker has captured a cryptographically protected password (for example via a network attack such as LLMNR spoofing, or by extracting a password hash from a compromised authentication system). In this case, account lockout is not going to offer any protection at all and it’s likely that the number of attempts per second an attacker can perform will be significant.

It’s also worth noting that we should be attempting to address many years of poor security advice that users have been given (both in good faith and in ignorance). Users will often look for the path of least resistance when utilising systems, so many bad habits may have been reinforced over time. Some users may choose known weak passwords such as “Password1” because they are simply lazy. However, other users may choose passwords like “Password1” because many of the systems they have interacted with over the years have applied basic rules to grade passwords, such as: “If the password is over 8 characters long and includes a mixed character set then consider it strong”. This has led those users to believe, incorrectly, that they were making a good security choice.

For example, here is a screenshot from Microsoft 365 Admin centre, grading a known insecure and commonly used password as “Strong”, but marking a 56 character passphrase as weak.

I believe that repeated exposure to incorrect ratings like this can mislead users, reinforce bad practices, and cause them to choose weaker passwords whilst thinking they’re more secure. In this case, simply adding a “1” to the end of the passphrase would cause it to jump from “Weak” to “Strong”.

The Problem With Complexity

To prevent users from selecting weak passwords, such as those based on dictionary words or names, some organisations enforce “Complexity”. Generally speaking, this refers to enforcing certain rules that a password must meet in order to be accepted. A common example of a complexity ruleset is Microsoft’s, which requires three of the following:

Uppercase letters
Lowercase letters
Numbers
Symbols

With this ruleset, basic dictionary words such as “password” or “misguided” would be blocked. The intention is likely to cause a user to select a password with higher entropy, such as: “sWNrw9LE”. Whilst it is true that this password would be much harder to guess than the word “password”, this ruleset would also allow “Password1” – which would not be difficult to guess.
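To make the weakness concrete, here is a minimal Python sketch of a "three of four character classes" check. It is an illustration of the rule as described above, not Microsoft's actual implementation, and the function name meets_complexity is our own.

```python
import string


def meets_complexity(password: str, required_classes: int = 3) -> bool:
    """Return True if the password uses at least `required_classes`
    of the four character classes listed above."""
    classes = [
        any(c.isupper() for c in password),             # uppercase letters
        any(c.islower() for c in password),             # lowercase letters
        any(c.isdigit() for c in password),             # numbers
        any(c in string.punctuation for c in password), # symbols
    ]
    return sum(classes) >= required_classes


# The rule rejects plain dictionary words but happily accepts "Password1".
for candidate in ["password", "misguided", "sWNrw9LE", "Password1"]:
    print(candidate, meets_complexity(candidate))
```

The check cannot tell the random-looking “sWNrw9LE” apart from “Password1”: both simply tick three boxes.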

Both NIST and the NCSC recommend against enforcing complexity, with the NCSC bluntly stating:

“Do not use complexity requirements”

NIST also recommends against enforcing complexity within SP 800-63B, but they put it in a slightly less pithy way:

“If the CSP or verifier disallows a chosen memorized secret based on its appearance on a blacklist of compromised values, the subscriber SHALL be required to choose a different memorized secret. No other complexity requirements for memorized secrets SHOULD be imposed.”

NIST also goes on to add:

“Highly complex memorized secrets introduce a new potential vulnerability: they are less likely to be memorable, and it is more likely that they will be written down or stored electronically in an unsafe manner.”

Again, this highlights that we’re balancing security with usability: very long and very random passwords are likely to push users towards bad practices such as writing passwords down.

As stated previously, when enforcing restrictions in authentication systems it is important to bear in mind what risk you are attempting to address with the requirement. Whilst complexity is likely enforced in the belief that it causes users to choose strong passwords, we recommend against enforcing it because it is an ineffective method of ensuring secure passwords.

The Problem With Rotation

Another common issue we see with authentication systems is “Password Rotation”: the requirement for a user to change their password after an arbitrary period of time, for example every 90 days. Again, the NCSC is pretty blunt here:

“Don’t enforce regular password expiry”

NIST has a similar recommendation in SP 800-63B:

“Verifiers SHOULD NOT require memorized secrets to be changed arbitrarily (e.g., periodically).”

The risk that this policy may be trying to address is the potential for an attacker to have compromised a password in some way but not yet used it. The hope is that the rotation policy will require the user to change their password before the attacker uses the compromised password.

Two real world examples that I come across regularly during penetration tests are:

An attacker acquires a password hash but “cracking” the hash will take time
A user writes a password down in a physical or digital file and an attacker comes across the file at a later date.

In the first example, the hope of rotation is that the password would “expire”, requiring the user to select a new password, before the attacker is able to crack the hash. However, when required to select a new password, users will often choose one that is only a minor variation of the old password, for example going from “Summer2024” to “Autumn2024”. In these cases, the rotation is likely ineffective at preventing the breach.

Additionally, the requirement to regularly change passwords puts an additional burden on users, and where minor variations are not used, may lead users to become more likely to use simpler, easy to remember passwords – and worse, to write those passwords down and store them insecurely.

Additionally, in many cases stolen passwords are exploited very quickly, either immediately or certainly within a much shorter time than the expiry period, so rotation does little to actually protect against the cases given above.

The following screenshot shows a user’s password history from a recent penetration test, showing that the user has changed their password 16 times and selected passwords that are almost identical each time, in a predictable pattern over time. It would not be difficult to guess the password this user is likely going to use next:

As rotation increases the burden on users, making them more likely to write passwords down and store them insecurely, and more likely to select weaker passwords or passwords that are only a minor variation of the previous password, we do not recommend that password rotation is enforced.

Minimum Password Length

During penetration tests, the most common minimum password length I come across is 8 characters. However, if I am able to extract NTLM hashes from a compromised machine, then cracking passwords this short is very simple.

For example, a modest GPU such as the Nvidia GeForce RTX 3070 is able to break the entire key space for all possible 8 character passwords in approximately 2.5 days.

There’s a fair criticism here to say that – if an attacker gets to the point of being able to extract hashes from a system is it not already “game over” for the system? However, it remains true that many users (and even system administrators) reuse passwords between devices, domains, and websites. Therefore, I would recommend that every effort be made to prevent password hash cracking where possible.

When it comes to fully random passwords, increasing the length of the password exponentially increases the number of possible combinations. Whilst I can crack the full key space for 8 characters in 2.5 days, for 9 characters it would take approximately 235 days. (However, you should note that many GPUs much faster than the 3070 exist and so do multiple GPU machines). As such, we recommend a minimum password length of 12 characters.
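The scaling described here can be sketched in a few lines of Python. The 2.5 day baseline is the figure quoted above; the alphabet size of 95 (printable ASCII) and the assumption that cracking time grows in direct proportion to the key space are ours, so treat the output as a rough estimate rather than a benchmark.

```python
BASELINE_LENGTH = 8   # password length for which the text gives a timing
BASELINE_DAYS = 2.5   # days to exhaust that key space, as quoted in the text
ALPHABET_SIZE = 95    # assumption: printable ASCII characters


def days_to_exhaust(length: int) -> float:
    """Estimate days to try every password of `length` characters,
    scaling the baseline by the alphabet size per extra character."""
    return BASELINE_DAYS * ALPHABET_SIZE ** (length - BASELINE_LENGTH)


for length in range(8, 13):
    days = days_to_exhaust(length)
    print(f"{length} characters: {days:,.1f} days ({days / 365.25:,.1f} years)")
```

Each extra character multiplies the work by the alphabet size, which is why moving from 8 to 12 characters makes exhaustive search impractical on the same hardware.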

Maximum Password Length

If you set a short maximum password length, such as 15 characters, this will do little to improve security but will likely prevent users from selecting strong passphrases.

However, you may find there are technical limitations that require you to set a maximum password length. One key example is that most implementations of the bcrypt password hashing algorithm have a maximum input length of 72 bytes. Another consideration is that it’s possible to perform a denial-of-service attack against some systems by sending a very large password (such as one that is 1 million characters long), as the system will potentially consume a large amount of CPU or memory when attempting to hash the supplied password to compare it to the one in storage.

Therefore, to prevent denial of service attacks, you may be required to set a maximum password length, but to ensure that this limit does not prevent users from selecting long passphrases it should not be set too short. Generally, it should be set to at least 64 characters.
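As a rough sketch, the length checks discussed in the last two sections could be applied before a password is ever passed to the hashing function. The 12 character minimum and the 72 byte bcrypt limit come from the text; the 128 character cap and the function name check_length are assumptions chosen for illustration.

```python
MIN_LENGTH = 12        # minimum length recommended in the text
MAX_LENGTH = 128       # assumption: any cap of at least 64 characters would do
BCRYPT_MAX_BYTES = 72  # input limit of most bcrypt implementations


def check_length(password: str, uses_bcrypt: bool = False) -> list:
    """Return a list of length problems; an empty list means the length is fine."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"shorter than {MIN_LENGTH} characters")
    if len(password) > MAX_LENGTH:
        # Rejecting oversized input early avoids hashing huge payloads.
        problems.append(f"longer than {MAX_LENGTH} characters")
    # bcrypt limits bytes, not characters, so measure the encoded form.
    if uses_bcrypt and len(password.encode("utf-8")) > BCRYPT_MAX_BYTES:
        problems.append(f"longer than {BCRYPT_MAX_BYTES} bytes when UTF-8 encoded")
    return problems


print(check_length("short"))
print(check_length("correct horse battery staple"))
print(check_length("é" * 40, uses_bcrypt=True))
```

Note that the bcrypt limit is measured in bytes: a 40 character password made of accented letters can exceed it even though it is well under the character cap.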

Password Deny Lists

It’s a good practice to review user passwords against a list of known weak, easily guessed, and previously compromised passwords. Ideally this should be done at the point that a user selects one of these passwords and at that point the user should be required to select a different password. However, if this is not technically possible then regularly reviewing user password hashes against these restrictions is an alternative.

There are off-the-shelf solutions to assist with this, however you should also consider adding context-specific passwords to the list, such as preventing users from selecting passwords based on the company name, the office location, or other contextual information.
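A minimal sketch of such a check at the point of password selection might look like the following. The word lists are tiny, hypothetical placeholders (including the made-up company name and office location); a real deployment would load a large list of compromised passwords instead.

```python
# Hypothetical placeholder lists, for illustration only.
COMMON_PASSWORDS = {"password", "welcome", "qwerty", "summer"}
CONTEXT_TERMS = {"examplecorp", "manchester"}  # made-up company name and office


def is_denied(password: str) -> bool:
    """Return True if the password should be rejected by the deny list."""
    normalised = password.lower()
    # Strip trailing digits and "!" so "Password1" matches "password".
    base = normalised.rstrip("0123456789!")
    if base in COMMON_PASSWORDS:
        return True
    # Reject anything containing a context-specific term.
    return any(term in normalised for term in CONTEXT_TERMS)


for candidate in ["Password1", "Summer2024", "ExampleCorp2024!", "batteryhorsestaple"]:
    print(candidate, is_denied(candidate))
```

If a candidate is denied, the user should be asked to pick a different password, as described above.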

Additionally, should a user’s password be known or suspected to have been compromised, then a password change should be enforced. This is distinct from password rotation that we mentioned earlier, where a password change is based on an arbitrary period of time.

Password Strength Meters

I mentioned previously that one of the biggest weaknesses with password strength meters is that, if they are badly implemented, they may give guidance that is actively harmful. That said, they can be useful if implemented well.

It is likely easier to create a system that simply informs users when bad decisions are made (such as basing passwords on a single dictionary word, or using common weak suffixes like 1234), rather than scoring a password on a scale and never unintentionally scoring a weak password as strong.
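One way to picture this approach is a function that returns warnings about specific bad decisions instead of a score. This is only a sketch: the dictionary is a tiny hypothetical stand-in for a real word list, and the two checks shown are examples rather than a complete set.

```python
import re

# Hypothetical stand-in for a real dictionary word list.
DICTIONARY = {"welcome", "password", "summer", "elephant"}

# Trailing digits and exclamation marks, e.g. "1", "1234", "2024!".
WEAK_SUFFIX = re.compile(r"[\d!]+$")


def password_warnings(password: str) -> list:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    lowered = password.lower()
    stripped = WEAK_SUFFIX.sub("", lowered)
    if stripped != lowered:
        warnings.append("ends with a common suffix such as digits or '!'")
    if stripped in DICTIONARY:
        warnings.append("based on a single dictionary word")
    return warnings


for candidate in ["Welcome1", "summer", "batteryhorsestaple"]:
    print(candidate, password_warnings(candidate))
```

An empty list here only means that no known bad pattern was found. It deliberately makes no claim that the password is “strong”, which avoids the misleading ratings described earlier.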

Three Random Words

The NCSC recommends that users consider creating passwords based on “Three Random Words” such as batteryhorsestaple. One of the main reasons they state this is that these passwords will be “strong enough” but easier to remember than fully random passwords.

During penetration tests we often see weak passwords based on dictionary words with common suffixes (such as Welcome1, Password123, or Summer2024) or based on letter substitutions (such as P@55w0rd, 3l3ph@nt, or L0ckedD0wn). These passwords can often be easily guessed and can almost always be cracked with ease.

In many instances, it is true that a Three Random Words password would be more secure than the passwords of this nature that we commonly see, and therefore overall a security improvement, but are they really “strong enough”?

A significant factor in the security of these passwords is resting on the word “random” in the Three Random Words. For example, it’s fairly obvious that passwords such as CatDogFish and OneTwoThree are unlikely to offer a good level of security. To quantify this, we could use the size of the dictionary the user is drawing from to determine how many possible passwords there would be.
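Because so much depends on the words being genuinely random, it is safer to let software pick them than to ask a person to think of three words. The sketch below uses Python's secrets module, which draws from a cryptographically secure random source. The eight-word list is a hypothetical placeholder; a real list would contain thousands of words.

```python
import secrets

# Hypothetical placeholder; a real word list would contain thousands of entries.
WORDS = ["battery", "horse", "staple", "correct", "lantern", "orbit", "pebble", "walnut"]


def pick_words(word_list: list, count: int = 3) -> list:
    """Pick `count` words independently and uniformly at random."""
    return [secrets.choice(word_list) for _ in range(count)]


words = pick_words(WORDS)
print("".join(words))
```

Each word is chosen independently, so repeats are possible. That is intentional: excluding them would slightly reduce the number of possible passwords.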

For example, if the user limits their word choice to only the top 5000 most common English words, then these passwords have at most 5000^3 possible combinations: that’s 125,000,000,000 possible passwords. That sounds like a lot until you read our Hashcracking on AWS article, where we show that in some instances attackers can perform more than 680,000,000,000 guesses per second.

If you expand the list of words used from the most common 5000 words, to include all dictionary words then you’re likely looking at more than 175,000 possible base words. This gives far more possible combinations (approximately 5,359,375,000,000,000). On our cracking system this would push out the time required to attempt all passwords from 1 second to just over 5 days. This is a huge improvement, but the passwords that you’re generating may look less like CorrectBatteryHorse and more like AbsentmindednessMicrocirculationFactitiousness. So, whilst we’re improving the security, we’re causing a strong negative impact on the usability again.

An alternative to increasing the dictionary length is simply to expand to four random words instead of three. When using that 175,000 word list with three words, our cracking rig can get through all possible combinations in just over 5 days; if you expand this to four random words then this extends to just over 2,400 years.
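The arithmetic in this section can be checked with a short Python sketch. The dictionary sizes and the 5 day figure are the ones quoted in the text; treating the three-word time as exactly 5 days is a simplifying assumption, so the four-word estimate is approximate.

```python
def combinations(dictionary_size: int, words: int) -> int:
    """Number of possible passwords made of `words` random dictionary words."""
    return dictionary_size ** words


three_common = combinations(5_000, 3)    # top 5000 common words
three_full = combinations(175_000, 3)    # full dictionary, three words
four_full = combinations(175_000, 4)     # full dictionary, four words

print(f"{three_common:,}")
print(f"{three_full:,}")
print(f"{four_full:,}")

# Adding a fourth word multiplies the work by the dictionary size.
THREE_WORD_DAYS = 5  # assumption: "just over 5 days" rounded down
four_word_years = THREE_WORD_DAYS * (four_full // three_full) / 365.25
print(f"roughly {four_word_years:,.0f} years")
```

Adding a word multiplies the number of combinations by the full dictionary size (175,000 here), whereas growing the dictionary from 5000 to 175,000 words only multiplies a three-word password's combinations by 35 cubed, which is 42,875.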

It is true that choosing very long and completely random passwords with a broad character set can achieve similar or better brute-force protection, but they are harder to remember. It is also true that password managers make the task of remembering passwords much easier.

However, the balance of security offered and the simple nature of the “three random words” format make it easy to get the message across, even to non-technical team members. It is therefore our recommended approach for most situations. Where a higher level of security is required, four random words can be used, although care needs to be taken with the word “random” in all cases.
