Authentication is Easy

Authentication (from Greek: αὐθεντικός authentikos, "real, genuine", from αὐθέντης authentes, "author") is the act of proving an assertion, such as the identity of a computer system user. – Wikipedia

No matter what type of system you're building, if you need to offer different experiences for different users, you have to verify each user's identity. This process is known as authentication. The most basic form of authentication is when the user provides a password that they previously shared with the system they are trying to connect to. To authenticate the user, we simply compare the password they provide with the one stored in the system. After all, it's just comparing two strings. How difficult could that be?


Beginning

As we said, it is easy, right? So let's implement it. I will use my imaginary language, which has a simple syntax and all the required libraries built in.

def authentication(username, password):
    # Look up the password stored for this user
    storedPassword = dbHelper.findUserPassword(username)
    if storedPassword == NOT_FOUND:
        return false
    else:
        # Plain string comparison of the two passwords
        valid = compare(password, storedPassword)
        return valid

Unfortunately, there is a problem with storing passwords this way: they are kept in a human-readable format, so anyone with access to the database could read users' passwords. Let's improve that.

Store Passwords Securely

How do you securely store passwords so that nobody can read your users' credentials, even if someone with malicious intent obtains the content of your database? And no, if you're thinking about using Base64 (and I know some people will), STOP IT NOW. Base64 is just a way to encode data; you can compare it to translating a message from English to French. If you don't know French, it might sound like a good way to hide secrets. It is… until you encounter someone who speaks French… or knows how to use Google. So imagine this is my secret password encoded as Base64: U3VwZXJTZWNyZXRQYXNzd29yZA==. Paste it into any website that offers Base64 decoding and you will have the original in a fraction of a second.
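As a quick illustration of how little protection Base64 offers, here is a minimal Python sketch that decodes the string above using only the standard library. No key or secret is involved at any point.

```python
import base64

# The "protected" password from the paragraph above
encoded = "U3VwZXJTZWNyZXRQYXNzd29yZA=="

# Decoding needs no key: Base64 is an encoding, not encryption
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # SuperSecretPassword
```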

Another thing some people think about is encrypting passwords.

Encryption changes a message into a coded format through a complex mathematical process. An encryption key is used to create the ciphertext, ensuring the message remains protected from anyone without the key. It sounds like the solution we need. Can you think of its drawbacks?

First, to compare encrypted passwords, you would need to decrypt them. To do that, you must provide a secure mechanism to store the decryption key. Yes, you could say "I will encrypt the key", but then you have a chicken-and-egg situation: how do you secure that second key? In theory, you could find some smart ways to do this, but the solution would be quite complex.

There could be another way. Some encryption schemes are deterministic: for the same key and the same input they always generate the same output (textbook RSA without random padding, for example). We could leverage that property and, instead of decrypting the password stored in the database, encrypt the one provided by the user and compare the result to the stored value.

This solution has some drawbacks, but it could work. However, I would ask: why would we do it this way when there's an alternative specifically designed for this purpose—hashing?

Hash

A hash is a function that converts any string to a fixed-length value, and it is not reversible. It is like the modulo operation in mathematics, which returns the remainder from dividing one number by another.

Imagine you have the number 42 and you want to divide it by 5: 42 mod 5 is equal to 2. Now, if you know only the result and the divisor, you won't be able to work out the original value (the dividend), as it is one of an infinite number of values.
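The same idea in a few lines of Python, as a small sketch: the remainder is easy to compute, but many different dividends produce it, so the result alone cannot be reversed. The search range of 0 to 99 is an arbitrary choice for the illustration.

```python
divisor = 5
remainder = 42 % divisor
print(remainder)  # 2

# Every number of the form 5 * k + 2 has the same remainder, so knowing
# the remainder and the divisor does not reveal the original dividend.
candidates = [n for n in range(100) if n % divisor == remainder]
print(candidates[:5])  # [2, 7, 12, 17, 22]
```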

The difference between the modulo function and a hash function is that a hash is "slightly" more complex. So let's hash our password before we use it. If you are a Java programmer and you are thinking about using Object::hashCode, then… do it and send me the link to your system. We need to use a cryptographic hash. A cryptographic hash produces fixed-length output, and the size depends on the algorithm we choose.
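To see the fixed-length property, here is a minimal sketch using Python's standard `hashlib` module. SHA3-256 is used only because it is convenient for showing the output size; it is not meant as a recommendation for password storage. The helper name `sha3_hex` is made up for this example.

```python
import hashlib

def sha3_hex(text: str) -> str:
    """Return the SHA3-256 digest of text as a hexadecimal string."""
    return hashlib.sha3_256(text.encode("utf-8")).hexdigest()

short_digest = sha3_hex("a")
long_digest = sha3_hex("a" * 10_000)

# Both digests are 32 bytes (64 hex characters), whatever the input length
print(len(short_digest), len(long_digest))  # 64 64
```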

So let’s redefine our authentication function to look like this:

def authentication(username, password):
    storedPassword = dbHelper.findUserPassword(username)
    # The database now holds a hash, so hash the candidate before comparing
    hashedPassword = Hash(password)
    if storedPassword == NOT_FOUND:
        return false
    else:
        valid = compare(hashedPassword, storedPassword)
        return valid

But which hashing algorithm should we use? There are many to choose from: Argon2, SHA3, MD5, and others.

The recommendation from industry experts is to use bcrypt, Argon2, or PBKDF2. MDx hashes such as MD5 are vulnerable to collisions (two or more distinct inputs generating the same output). With the SHA family, there's a different issue: it's quick, meaning output is generated very quickly. That might sound good, but it also means an attacker can try a huge number of guesses per second in a brute-force attack. So we need something collision-resistant and deliberately slow, which is where Argon2, bcrypt, and PBKDF2 come into the picture.
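Of the three, PBKDF2 is available in Python's standard library, so it makes for an easy sketch. The iteration count is what makes the function slow; the value below is only illustrative and should be tuned for your own hardware. PBKDF2 also requires a salt, which this article comes back to near the end; the fixed `DEMO_SALT` here is an assumption made purely to keep the example short, and `slow_hash` is a made-up helper name.

```python
import hashlib

ITERATIONS = 200_000             # illustrative work factor, not a recommendation
DEMO_SALT = b"demo-salt-16byte"  # fixed only for this demo; use a random salt per user

def slow_hash(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a 32-byte hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

digest = slow_hash("SuperSecretPassword", DEMO_SALT)
print(len(digest))  # 32
```

Raising `iterations` makes every guess proportionally more expensive for an attacker, at the cost of a slower login for legitimate users.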

Secure String Comparison

So, let's say we picked a good algorithm, and it is implemented as the CryptoHash method. Now our code looks like this:

def authentication(username, password):
    storedPassword = dbHelper.findUserPassword(username)
    # CryptoHash wraps the slow password-hashing algorithm chosen above
    hashedPassword = CryptoHash(password)
    if storedPassword == NOT_FOUND:
        return false
    else:
        valid = compare(hashedPassword, storedPassword)
        return valid


This implementation is still not perfect, and you're probably wondering why. Basic string comparison in most languages short-circuits: it compares strings only until it finds the first character that doesn't match. That means the response time when comparing "aaaaaaaa" against "baaaaaaa" or "aaaaaaab" will be different. An attacker could use that knowledge to determine consecutive characters of the password hash, which significantly reduces the number of attempts they need.
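A small Python sketch can make the leak visible. Wall-clock timings are noisy, so instead of measuring time this example counts how many characters a short-circuiting comparison inspects before it returns. `naive_compare_steps` is a hypothetical helper written for this illustration, not a library function.

```python
from typing import Tuple

def naive_compare_steps(a: str, b: str) -> Tuple[bool, int]:
    """Short-circuit comparison that also reports how many characters it inspected."""
    if len(a) != len(b):
        return False, 0
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            # Early exit: the amount of work done reveals where the mismatch is
            return False, steps
    return True, steps

secret = "aaaaaaaa"
print(naive_compare_steps(secret, "baaaaaaa"))  # (False, 1)
print(naive_compare_steps(secret, "aaaaaaab"))  # (False, 8)
print(naive_compare_steps(secret, "aaaaaaaa"))  # (True, 8)
```

The more leading characters a guess gets right, the more work the comparison does, and that difference is what an attacker tries to observe through response times.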

Let's do some math. To keep the calculation simple, we will use SHA-1.

SHA-1 generates a 20-byte value, which gives 256^20 combinations, a very big number. A ball of rice containing 256^20 grains could have a radius larger than the distance from the Sun to Jupiter. An insecure comparison lets the attacker recover the hash one byte at a time, which reduces that number to at most 256*20 = 5,120 attempts. Represented as rice grains, that is a portion of rice you get from a Chinese takeaway.
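The two numbers are easy to check with Python's arbitrary-precision integers. This is only a worked version of the arithmetic above; the variable names are made up for the example.

```python
DIGEST_BYTES = 20      # size of a SHA-1 digest
VALUES_PER_BYTE = 256  # each byte can take 256 values

# Guessing the whole digest at once: every combination is a candidate
blind_guesses = VALUES_PER_BYTE ** DIGEST_BYTES

# Guessing one byte at a time thanks to the timing leak:
# at most 256 tries for each of the 20 positions
leaky_guesses = VALUES_PER_BYTE * DIGEST_BYTES

print(blind_guesses)  # 1461501637330902918203684832716283019655932542976
print(leaky_guesses)  # 5120
```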

The conclusion is straightforward: we need a secure comparison method that always compares every character in the string. The pseudocode will look something like this:

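def secureCompare(passwordHash, passwordCandidate):
    # Hashes of different lengths can never match
    if len(passwordHash) != len(passwordCandidate):
        return false
    result = true
    # Visit every position; never exit early on the first mismatch
    for i in range(len(passwordHash)):
        if passwordHash[i] != passwordCandidate[i]:
            result = false
    return result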
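In a real language you would normally not write this loop yourself. As one example, Python's standard library ships `hmac.compare_digest`, which is designed to reduce timing leaks when comparing secrets. A minimal sketch, where `secure_compare` is just a thin wrapper named for this article:

```python
import hmac

def secure_compare(stored_hash: bytes, candidate_hash: bytes) -> bool:
    """Compare two hashes without short-circuiting on the first mismatch."""
    return hmac.compare_digest(stored_hash, candidate_hash)

print(secure_compare(b"aaaaaaaa", b"aaaaaaaa"))  # True
print(secure_compare(b"aaaaaaaa", b"aaaaaaab"))  # False
```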

And our next iteration will look like this:

def authentication(username, password):
    storedPassword = dbHelper.findUserPassword(username)
    hashedPassword = CryptoHash(password)
    if storedPassword == NOT_FOUND:
        return false
    else:
        # Compare every character so timing does not leak the mismatch position
        return secureCompare(hashedPassword, storedPassword)

But What if There Is No User?

So far, we’ve followed the happy path. There is a user, and we’ve fetched their password and compared it with the one provided by the user. But what if the user doesn’t exist?

If we short-circuit here, this code could be used for account enumeration, allowing an attacker to determine which usernames exist and which don’t. Yes, this might be difficult if the username is a random string, but it’s often an email address that could be easily obtained from various sources, such as by performing an old-fashioned OSINT investigation.

The ideal solution is to keep a fake password hash that can never match and to run it through the same comparison, so that a missing user takes the same path as a wrong password. So, let's add this to our process:

def authentication(username, password):
    storedPassword = dbHelper.findUserPassword(username)
    hashedPassword = CryptoHash(password)
    if storedPassword == NOT_FOUND:
        # Do the same work as for a real user, then reject
        secureCompare(fakePasswordHash, hashedPassword)
        return false
    return secureCompare(storedPassword, hashedPassword)
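Here is how the same idea might look as runnable Python, as a sketch under a few assumptions: the `USERS` dictionary stands in for the database, PBKDF2 stands in for CryptoHash, each record carries its own random salt (salts are discussed near the end of the article), and the iteration count is kept low only so the demo runs quickly. All names are made up for this example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor only

def crypto_hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def make_record(password: str) -> tuple:
    salt = os.urandom(16)
    return salt, crypto_hash(password, salt)

# Hypothetical in-memory stand-in for the user table
USERS = {"alice@example.com": make_record("correct horse")}

# Fake record used for unknown usernames so they cost the same work
FAKE_RECORD = (os.urandom(16), os.urandom(32))

def authenticate(username: str, password: str) -> bool:
    record = USERS.get(username)
    user_exists = record is not None
    salt, stored_hash = record if user_exists else FAKE_RECORD
    # Hash and compare on every path, whether or not the user exists
    candidate = crypto_hash(password, salt)
    matches = hmac.compare_digest(candidate, stored_hash)
    return user_exists and matches

print(authenticate("alice@example.com", "correct horse"))  # True
print(authenticate("nobody@example.com", "correct horse"))  # False
```

The final `user_exists and matches` guarantees that the fake record can never produce a successful login, even in the astronomically unlikely case that the random bytes happened to match.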

Locking Accounts

Now we've handled authentication whether or not the user exists, but what if we want to lock an account after N failed authentication attempts?

The first thought would be to extend the user table and add a lock there. Now, our authentication method will look like this:

def authentication(username, password):
    hashedPassword = CryptoHash(password)
    user = dbHelper.findUser(username)
    if user == NOT_FOUND:
        secureCompare(fakePasswordHash, hashedPassword)
        return false
    # Refuse to authenticate once the failure counter reaches the limit
    if user.failAttempts >= MAX_ATTEMPTS:
        throw "user locked"
    storedPassword = user.password
    if secureCompare(storedPassword, hashedPassword):
        # A successful login resets the counter
        user.failAttempts = 0
        dbHelper.update(user)
        return true
    else:
        user.failAttempts++
        dbHelper.update(user)
        return false

While this solution might look good, there is still a small issue. Since locks exist only for real accounts, the "user locked" response tells a potential attacker which accounts exist in the system. There's also the risk of a denial-of-service attack: anyone can lock a legitimate user out simply by submitting wrong passwords for their account. A better solution is to create an additional table that holds lock information keyed by the submitted account name, whether or not the account exists. Ideally, short-term locks or progressive delays would slow the attacker without affecting legitimate users.
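One possible shape for such a table, sketched in Python. The `LoginThrottle` class, its method names, and the delay constants are all assumptions made for this illustration; a dictionary stands in for the additional table. Failures are tracked for whatever username was submitted, so the behaviour is identical for existing and non-existing accounts, and the delay doubles with each failure up to a cap instead of locking the account permanently.

```python
import time
from typing import Callable, Dict, Tuple

BASE_DELAY_SECONDS = 1.0   # assumed starting delay
MAX_DELAY_SECONDS = 300.0  # assumed upper bound on the delay

class LoginThrottle:
    """Tracks failures per submitted username, whether or not an account exists."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # username -> (consecutive failures, time when the next attempt is allowed)
        self._entries: Dict[str, Tuple[int, float]] = {}

    def is_blocked(self, username: str) -> bool:
        _, allowed_at = self._entries.get(username, (0, 0.0))
        return self._clock() < allowed_at

    def record_failure(self, username: str) -> float:
        """Register a failed attempt and return the delay now in force."""
        failures, _ = self._entries.get(username, (0, 0.0))
        failures += 1
        # Progressive delay: 1s, 2s, 4s, ... capped at MAX_DELAY_SECONDS
        delay = min(BASE_DELAY_SECONDS * 2 ** (failures - 1), MAX_DELAY_SECONDS)
        self._entries[username] = (failures, self._clock() + delay)
        return delay

    def record_success(self, username: str) -> None:
        self._entries.pop(username, None)
```

The authentication function would call `is_blocked` before doing any work and `record_failure` or `record_success` afterwards. A production version would also need to expire old entries so the table cannot grow without bound.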

And beyond…

Of course, there are more things to consider, like "spicing up" passwords by adding salt and pepper. Salt is a random string added to the password before hashing, ensuring that even if two users have the same password, their hashes will be different. Salt is stored along with the hash in the database and helps protect against rainbow table attacks, because precomputed tables are useless when every password is hashed with its own random salt. Pepper, on the other hand, is a secret value applied to the password in a similar way but kept separate from the database, often in application code or an environment variable. This adds another layer of security: if someone compromises the database, they still won't have the pepper.
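A minimal sketch of both ingredients in Python, using only the standard library. The environment variable name `AUTH_PEPPER`, its development fallback, the helper names, and the iteration count are assumptions for this example. Applying the pepper with HMAC before the slow hash is one common way to do it, not the only one.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor only
# Hypothetical environment variable; the fallback is for local experiments only
PEPPER = os.environ.get("AUTH_PEPPER", "dev-only-pepper").encode("utf-8")

def hash_password(password: str, salt: bytes) -> bytes:
    # Pepper: keyed with a secret that is never stored in the database
    peppered = hmac.new(PEPPER, password.encode("utf-8"), hashlib.sha256).digest()
    # Salt: random per-user value stored next to the hash
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)

def new_record(password: str) -> tuple:
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    return hmac.compare_digest(hash_password(password, salt), stored_hash)

# Two users with the same password end up with different hashes
salt_a, hash_a = new_record("SuperSecretPassword")
salt_b, hash_b = new_record("SuperSecretPassword")
print(hash_a != hash_b)  # True
```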

Another option to consider is CAPTCHA, which can slow down brute-force attacks. CAPTCHA was designed to prevent automated tools from accessing websites. This could have worked if not for those meddling kids: almost as soon as CAPTCHA was introduced, CAPTCHA-bypassing tools and services started to appear, and now, with the rise of AI, the cost and effort of bypassing a CAPTCHA are much smaller.

Summary

Authentication might look straightforward, just checking whether the user's password matches the one stored in the system, but as you can see, there is more to it.

You need to choose the right algorithm to store passwords in the database; choosing badly can have severe implications for the security of the system.

Sometimes additional information can be exposed just by returning different error responses, or can be inferred by measuring the time a request takes to complete.

Locking users can be a good defence against brute-force attacks, but if implemented poorly it can prevent legitimate users from accessing the system.

On top of that, some things are not covered in this article, like creating audit trails, logging all failed attempts, or handling user sessions and authenticating user requests within a session.

All those issues make implementing the authentication process not as simple as it initially seemed. The best way to avoid them is to reuse existing solutions, which are often built into frameworks and libraries. Those solutions are usually well tested against these and other issues that can be encountered during the authentication process.

More articles by Wojciech Cichon

  • Securing SDLC - Vulnerability in 3rd party components.

    Every year the number of vulnerabilities reported in open source components increases. Unfortunately, software teams…

  • Taking Security Seriously

    While I was looking for a solution to a problem on StackOverflow a few days ago, the idea for this article came to me…
