A friend and I are discussing whether we should pre-hash the passwords of our web app's users before sending them to our servers.
I know there are multiple questions that already cover this topic, but they are all about transferring the password securely to the server. Our idea is not about transfer security (we use SSL); we want to hash client-side to prevent the "real" passwords from ever reaching our server.
The idea came up when Twitter announced their bug that caused passwords to be written to a log file in cleartext.
We are currently discussing whether this concept makes sense, and how it would affect the security of a password (in terms of brute force) if we hashed it with SHA-512.
TL;DR:
We want to hash passwords client-side to prevent our servers from receiving them in cleartext (we use SSL for transfer).
Does this make any sense?
What algorithm would be best to use for hashing?
The hashed passwords would then be hashed again server-side with bcrypt.
It 100% makes sense. In fact, the concept has been proposed by a number of people, but the difficulty is in implementing it correctly. There are a number of pitfalls if you do it wrong; the most direct one is being vulnerable to "pass-the-hash", as @swa66 describes. To prevent that, you need to hash on both sides. The client-side hash should be slow (bcrypt, scrypt, Argon2, or PBKDF2), whereas the server-side hash should be fast (SHA-256).
EDIT: A number of people have down-voted this without understanding how this works, so I now include the basic details here (previously I only linked to how this works). The idea is to apply a slow hash such as bcrypt on the client side, and then a fast hash such as SHA-256 on the server side. The fast hash is required to prevent pass-the-hash attacks. In the event of a database leak, an attacker either has to invert the fast hash (impossible -- violates the one-way property of a cryptographic hash function), or brute force the preimage to the fast hash (impossible -- the size is the length of the output from the slow hash, for example 184 bits for bcrypt), or brute force the combination of the slow hash and the fast hash -- which puts the attacker back at the same position as if the entire computation had happened server side. So we have not reduced the security of password attacks in the event of a database leak by shifting the heavy computation to the client side.
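The slow-then-fast arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the scheme from the linked paper: it uses PBKDF2 from Python's standard library as a stand-in for the slow client-side hash (bcrypt would need a third-party package), and the function names, the iteration count, and the way the salt reaches the client are all assumptions made for the example.

```python
import hashlib
import hmac

# Assumed work factor, for illustration only; tune it for real hardware.
PBKDF2_ITERATIONS = 200_000


def client_slow_hash(password: str, salt: bytes) -> bytes:
    """Runs on the client: expensive key stretching of the raw password."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


def server_fast_hash(client_hash: bytes) -> bytes:
    """Runs on the server: one cheap hash of the high-entropy client output."""
    return hashlib.sha256(client_hash).digest()


def server_verify(stored: bytes, submitted_client_hash: bytes) -> bool:
    """Compare in constant time against the value kept in the database."""
    return hmac.compare_digest(stored, server_fast_hash(submitted_client_hash))


# Registration: the server only ever stores the fast hash of the client hash.
salt = bytes(16)  # hypothetical fixed salt, only to keep the example repeatable
stored = server_fast_hash(client_slow_hash("correct horse", salt))

# Login with the right password succeeds, a wrong one fails.
print(server_verify(stored, client_slow_hash("correct horse", salt)))
print(server_verify(stored, client_slow_hash("wrong guess", salt)))

# Pass-the-hash: replaying the leaked database value does not log in,
# because the server hashes whatever it receives once more.
print(server_verify(stored, stored))
```

The last check is the point of the server-side hash: a value copied out of a leaked database is not itself a valid login credential.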
I've surveyed a number of proposals like this in "Method to protect passwords in databases for web applications". Additionally, I analyse the pros and cons and identify weaknesses that have not been identified before (account enumeration), and propose a unique way of doing this securely. The research is built off a number of sources, including:
- Secure authentication: partial client-side key stretching… please review/criticize my idea
- How to securely hash passwords? -- see section on Client Side Hashing
- Client side password hashing
- Discussion from various authors on Hacker News -- see comments from oleganza, mschuster91, crusso, etc...
You cite the Twitter example, and GitHub did similarly. When I wrote the paper above, the most prominent example for preventing a server from seeing the clear text passwords was Heartbleed, which I comment on in the paper (bottom of Section 1.3).
There has been subsequent follow-up research by others identifying similar ideas, for example "Client-Plus-Server Password Hashing as a Potential Way to Improve Security Against Brute Force Attacks without Overloading the Server". No one person deserves all the credit, but the main takeaway is: yes, it is a good idea if you do it securely, but you really need to understand the risks (it is easy to do insecurely if you have not read the research).
NO!
Rule one in cryptography: do not invent it yourself, you'll make horrible mistakes.
It's nothing against you personally, far from it: even top-notch experts make mistakes when designing new systems with great care. That's why they peer-review each other's work multiple times before anything becomes a standard. Many proposals for such standards by such experts get withdrawn due to problems detected during peer review. So why can't the rest of us mere mortals design our own? Because there is nobody good enough to do the peer review, as the experts will not touch it.
Hashing the password client side
Hashing client side is really bad as the hash becomes the password, and now you store it on the server in the clear.
How to do passwords
- Only store hashed passwords (implied: send the password to the server, just do not store it)
- Use a salt and store it with the password (unencrypted). The salt is essentially a random string that you concatenate to the password before you hash it (both to store it and to verify it).
- Use a SLOW hash. Using a fast hash is a common and fatal mistake, even when using salts. Most hash functions people know, like SHA-256, SHA-3 etc., are fast hashes and completely unsuitable for hashing short, predictable items like passwords, as the input can be recovered by brute force in a surprisingly short time.
- How slow: as slow as you can afford. Examples of slow hashes: bcrypt, PBKDF2 (which is essentially a high number of rounds of a fast hash to make it slow).
Depending on your programming environment, there are pre-made routines. Use them!
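As one example of such a pre-made routine, Python's standard library ships PBKDF2 as `hashlib.pbkdf2_hmac`. The sketch below follows the list above (random salt stored next to the hash, slow hash, constant-time comparison). The helper names and the iteration count are assumptions for illustration, not recommendations from this answer.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor; "as slow as you can afford" depends on your hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both, the salt does not need to be secret."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the slow hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("hunter2")
print(check_password("hunter2", salt, digest))
print(check_password("hunter3", salt, digest))
```

Because the salt is random, hashing the same password twice gives two different digests, so identical passwords do not produce identical database entries.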
Ref:
- https://crypto.stackexchange.com/questions/24/what-makes-a-hash-function-good-for-password-hashing
- https://crypto.stackexchange.com/questions/59797/authorities-on-password-hashing-best-practice
While @swa66 outlined how to manage passwords securely, let me note that there is a valid scenario where you can consider client-side password hashing. So don't just blindly follow "best practice"; try to understand it first.
Let's say I have a standard web application that stores data from users. In my threat model, I don't want my users to have to trust even me; in other words, I want my users' data to be secure even if my servers are fully compromised. Therefore, I let them choose a password and encrypt their data on the client before sending it to the application. They can retrieve their encrypted data with their user ID. Well, that doesn't sound very secure: I can just download anybody's encrypted data and run offline attacks against it. So let's have them access their encrypted data with their password (I don't want them to have to remember two different passwords). But that's not good either, because then I have their password and could decrypt their data.
So one simple solution is to encrypt their data with their password and send it to the server along with their hashed password, which, as correctly noted in the other answer, is the new password as far as the server is concerned (so the server should store it hashed once again, and so on). The server has no way to decrypt client data because it never has the original password, yet only the right person can download even their own encrypted data, and they only have to remember one password. (Note that this is a very simplified model; in reality much more is needed, for example a proper key derivation function rather than plain hashes, but that's another, much longer story.)
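To make the "one password, two purposes" idea a bit more concrete, here is a rough sketch of how a client might derive two independent values from the same password: one that never leaves the client and is used as an encryption key, and one that is sent to the server as the login credential. This is only one possible refinement under stated assumptions, not a complete design. The labels, function names, and iteration count are invented for the example, and the actual encryption step is left out because Python's standard library has no authenticated cipher.

```python
import hashlib
import hmac
from typing import Tuple

# Assumed work factor, for illustration only.
ITERATIONS = 200_000


def derive_client_keys(password: str, user_salt: bytes) -> Tuple[bytes, bytes]:
    """Client side: stretch the password, then split it into two unrelated values."""
    master = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), user_salt, ITERATIONS
    )
    # Distinct labels give independent outputs from the same master secret.
    enc_key = hmac.new(master, b"encryption", hashlib.sha256).digest()
    auth_token = hmac.new(master, b"authentication", hashlib.sha256).digest()
    return enc_key, auth_token


def server_store(auth_token: bytes) -> bytes:
    """Server side: the token is the 'new password', so store only its hash."""
    return hashlib.sha256(auth_token).digest()


def server_check(stored: bytes, auth_token: bytes) -> bool:
    """Server side: constant-time check of a submitted token."""
    return hmac.compare_digest(stored, server_store(auth_token))


enc_key, auth_token = derive_client_keys("one password", bytes(16))
stored = server_store(auth_token)      # what the server keeps
print(server_check(stored, auth_token))  # the user can log in and download
print(server_check(stored, enc_key))     # the encryption key is not a login
```

In this sketch the server sees only `auth_token`, from which neither the password nor `enc_key` can be computed without brute-forcing the slow derivation.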
Don't get me wrong, I'm not saying you should normally be hashing passwords on the client. No, the other answer is the correct one in that regard. I just wanted to show that there is at least one use case where client-side password hashing is a valid option. Some well-known password managers work similarly.