What's a good password in 2018?
Public domain by Carlos ZGZ https://flic.kr/p/ExnsK6

What's a good password in 2018?

Executive summary

There's no such a thing as a good password so let's say a not bad password.

A not bad password is a random string at the very least 10 characters long. And don’t give a damn to password policies if you can!

How to gauge a password?

If one wants to assess passwords rigorously, there are two things to do: choose a (finite) set of possible passwords and stick it a testable property. The set determines what kind of passwords you can build, the test allows comparison between passwords. 

To define the passwords set, one needs to know its dictionary (the finite charset) and a given password length. Let’s name E such a set and card(E) its cardinality. Bitwise, the passwords length Lmax is: ceiling(log2(card(E))).

As for the measure, we can leverage Kolmogorov complexity, preferably the prefix-free flavor [3] as it is more natural to manipulate. Let’s refer it as C(.)

The test, based on C(.), will be explained later on.

Making useful passwords policies

If passwords are chosen at random within E, then only the two criteria mentioned above are necessary and sufficient to define a policy. But which one is more important: the dictionary, or the length? Obviously, both criteria are subject to strong physical constraints: 

Dictionary: if we want end users to type in their passwords on any keyboard regardless of the country or language, the number of possible keys must be relatively small.

Length: passwords of length 4 are pretty useless while passwords of length 20 are unpractical.  

So let’s figure this out with the help of a few practical, real-life cases: 

  • Large dictionary, long passwords: alphanumerical (upper and lowercase) plus 20 special characters, password length = 10
  • Large dictionary, short passwords: alphanumerical (upper and lowercase) plus 20 special characters, password length = 8
  • Small dictionary, long passwords: alphanumerical (upper and lowercase), password length = 10
  • Small dictionary, short passwords: alphanumerical (upper and lowercase), password length = 8


Given that the maximum complexity of a string of binary length l taken from E is equal to Lmax (because C(s|E)<=log2(card(E))+O(1)), what is the maximum complexity in the four situations described above?

  • Large dictionary, long passwords: 64 bits
  • Large dictionary, short passwords: 51 bits
  • Small dictionary, long passwords: 60 bits
  • Small dictionary, short passwords: 48 bits

We see that the size of the dictionary, when chosen from a keyboard-friendly range, has little impact on complexity. The most important factor is password length.

Now that we now our play area (long passwords more-or-less regardless of the dictionary), let’s dive into the details.


How long is long enough?

To answer this question, let’s try to give sense to password weakness.

A password p is said to be bruteforce-weak if it can be bruteforced in polynomial time using state of the art technology. What is state of the art? We need to pick a reference point. 

For example [1]: in 2012, a password of 7 characters taken from a dictionary of size 95 was said to be weak. Its length is 7*log2(95)=46 bits

If you are not happy with that, you may switch to another more accurate benchmark and make your own calculations.

What is a bruteforce-weak password in 2018?

If computation power stills follows Moore Law, computation power has been (very roughly) multiplied by 16 since our reference point in 2012.

Also if we agree that the gain in bits G agrees sublinearily with computation power, as suggested by [1], then G <= O(log2(16)) = 4 bits

So the bitwise length of a weak password in 2018 is 46 (bitwise length in 2012) + 4 (gain) = 50. This accounts to a password of length 7.6 if we keep the same charset.


A password should be at least 8 characters long not to be bruteforce-weak.

That’s not enough, though!

How random is random ?

If d() is the random deficiency [2] of a string s in E, let’s define d(s|E)=log2(card(E)) – C(s|E)

This will be our bitwise test to know if a password p is random since d(p|E)=0 when its Kolmogorov complexity is maxed out. This test is not computable, which is strange for a test, but you will see that it will do.

The probability that d(p|E)>= n bits is equal to 1/(2^n) where n<log2(card(E))

If you want a proof: There are 2^Lmax passwords in E, but there are only (2^Lmax)-1 strings of length < Lmax (there are 2 strings of length 1, 4 of length two, … (2^(Lmax-1) of length Lmax-1).

So there is one password in the set that cannot be written by a shorter representation (ie, it cannot be compressed).

Likewise, there are only 2^(Lmax-1)-1 strings of length < Lmax-1. Since 2^(Lmax-1)=0.5*2^Lmax, half of the passwords of length>=Lmax-1 cannot be compressed. 

And so on: 

? a quarter of the passwords of length>=Lmax-2 can be compressed

? (1/2^(n-1)) of passwords of length>=Lmax-(n-1) can be compressed. In E, this means card(E)/2^(n-1) strings. 

For these strings, the complexity is at most equal to log2(card(E/2^(n-1))+O(1) as already seen

But log2(card(E/(2^(n-1)))=log2(card(E))-log2(2^(n-1))

So for card(E)/2^(n-1) strings,

d(p|E)>log2(card(E)) - (log2(card(E))-log2(2^(n-1)))

 that is: d(p|E)>log2(2^(n-1))

 that is: d(p|E)>n-1


A password p in E is said to be c-weak if d(p|E)=c

Let’s define E as the set of passwords with 7.6 characters length (or 50 bits) or less and a dictionary size of 95 items. We’ve seen that, as of 2018, all elements in E are bruteforce-weak.

Now let’s define a new set E’ bigger than E where all bruteforce-weak passwords from E are compressed. 

In E’:

? Half of the passwords are at least 1-weak by definition of d(.|.)

? This includes all bruteforce-weak passwords by construction of E’

How is E’ constructed? By picking the same dictionary as E, increasing the password length, and making a compression map. What password length will we choose for E’?

8 characters, or 52.5 bits : the bruteforce-weak passwords (of length 50 bits or less) represent approximatively 35.4% of the set

9 characters, or 59.1 bits: there are still 0.4% bruteforce-weak passwords

10 characters, or 65.7 bits: they dwindle to 0.004%

Conclusion

See exec sum ;)

References

[1] Errata Security analysis reported in Ars Technica : https://arstechnica.com/security/2012/08/passwords-under-assault/4/

[2] Ming Li et Paul Vitanyi, https://books.google.fr/books?isbn=1475738609, page 100

[3] Introduction by Lance Fortnow : https://people.cs.uchicago.edu/~fortnow/papers/kaikoura.pdf, or reference [2]




Greg Holmsen

The Philippines Recruitment Company - ? HD & LV Mechanic ? Welder ? Metal Fabricator ? Fitter ? CNC Machinist ? Engineers ? Agriculture Worker ? Plant Operator ? Truck Driver ? Driller ? Linesman ? Riggers and Dogging

6 年

This is exactly what I wanted to read about today! I agree with your point of view on strong passwords.

回复

要查看或添加评论,请登录

Christophe Parisel的更多文章

  • Adversarial lateral motion in Azure PaaS: are we prepared?

    Adversarial lateral motion in Azure PaaS: are we prepared?

    Lateral motion techniques are evolving in PaaS, and we should be worried. Let's discuss a risk confinement approach.

    19 条评论
  • How will Microsoft Majorana quantum chip ??compute??, exactly?

    How will Microsoft Majorana quantum chip ??compute??, exactly?

    During the 2020 COVID lockdown, I investigated braid theory in the hope it would help me on some research I was…

    16 条评论
  • Zero-shot attack against multimodal AI (Part 2)

    Zero-shot attack against multimodal AI (Part 2)

    In part 1, I showcased how AI applications could be affected by a new kind of AI-driven attack: Mystic Square. In the…

    6 条评论
  • Zero-shot attack against multimodal AI (Part 1)

    Zero-shot attack against multimodal AI (Part 1)

    The arrow is on fire, ready to strike its target from two miles away..

    11 条评论
  • 2015-2025: a decade of preventive Cloud security!

    2015-2025: a decade of preventive Cloud security!

    Since its birth in 2015, preventive Cloud security has proven a formidable achievement. By raising the security bar of…

    11 条评论
  • Exploiting Azure AI DocIntel for ID spoofing

    Exploiting Azure AI DocIntel for ID spoofing

    Sensitive transactions execution often requires to show proofs of ID and proofs of ownership: this requirements is…

    10 条评论
  • How I trained an AI model for nefarious purposes!

    How I trained an AI model for nefarious purposes!

    The previous episode prepared ground for today’s task: we walked through the foundations of AI curiosity. As we've…

    19 条评论
  • AI curiosity

    AI curiosity

    The incuriosity of genAI is an understatement. When chatGPT became popular in early 2023, it was even more striking…

    3 条评论
  • The nested cloud

    The nested cloud

    Now is the perfect time to approach Cloud security through the interplay between data planes and control planes—a…

    8 条评论
  • Overcoming the security challenge of Text-To-Action

    Overcoming the security challenge of Text-To-Action

    LLM's Text-To-Action (T2A) is one of the most anticipated features of 2025: it is expected to unleash a new cycle of…

    19 条评论

社区洞察

其他会员也浏览了