With all the recent (e.g. LinkedIn) discussions of passwords I'm looking at password hashing implementations. After two cups of coffee and a morning of reading I'm no more a cryptographer than when I started, and I really don't want to pretend that I am.
Specific Questions
Does using an integer unique user ID fail as an effective salt? (crypt() uses only 16 bits?)
If I simply run sha256() on a hash over and over until a second is used up, does that defeat brute-force attacks?
If I have to ask these questions, should I be using bcrypt?
Discussion/Explanation:
The goal is simply that if my users' hashed passwords were leaked:
- they would not be "easy" to crack,
- cracking one password would not expose other users that use the same password.
What I've read for #1 is that the hash computation must be expensive -- taking, say, a second or two to calculate and maybe requiring a bit of memory (to thwart hardware-accelerated cracking).
bcrypt has this built in, and scrypt, if I understand correctly, is more future-proof and includes a minimum memory usage requirement.
But is it an equally effective approach to eat time by "rehashing" the result of sha256() as many times as needed to use up a few seconds, and then store the final loop count with the hash so a provided password can be checked later?
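To make the rehashing idea concrete, here is a minimal Python sketch of what I have in mind. The function names (`stretch_sha256`, `make_record`, `check_record`) and the `iterations$salt$hash` storage format are made up for illustration, and this is not a vetted scheme. As far as I can tell, `hashlib.pbkdf2_hmac` in the standard library is a standardized take on the same loop-and-store-the-count idea.

```python
import hashlib
import hmac


def stretch_sha256(salt: bytes, password: str, iterations: int) -> str:
    """Hash salt + password once, then rehash the digest until `iterations` rounds are done."""
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    for _ in range(iterations - 1):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def make_record(salt: bytes, password: str, iterations: int) -> str:
    """Store the loop count and salt next to the hash (hypothetical format)."""
    return f"{iterations}${salt.hex()}${stretch_sha256(salt, password, iterations)}"


def check_record(record: str, candidate: str) -> bool:
    """Recompute with the stored loop count and salt, then compare in constant time."""
    iterations_text, salt_hex, expected = record.split("$")
    actual = stretch_sha256(bytes.fromhex(salt_hex), candidate, int(iterations_text))
    return hmac.compare_digest(actual, expected)


record = make_record(b"\x00\x2a", "mypassword", 10_000)
print(check_record(record, "mypassword"))     # True
print(check_record(record, "notmypassword"))  # False
```

Storing the loop count in the record means it can be raised for new passwords later without breaking the check for old ones.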
For #2, using a unique salt for every password is important. What's not been clear is how random (or large) the salt must be. If the goal is to keep everyone who uses "mypassword" as their password from having the same hash, is it not enough to simply do this?
hash = sha256_hex( unique_user_id + user_supplied_password );
or even this, although I'm not sure it buys me anything:
hash = sha256_hex( sha256( unique_user_id ) + user_supplied_password );
The only benefit I can see from using the user's ID, besides knowing it is unique, is avoiding having to save the salt along with the hash. Not much of an advantage. Is there a real problem with using a user's ID as the salt? Does it not accomplish #2?
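Here is a small Python sketch of the user-ID approach next to a random salt, just to check that it does what I expect. The helper names `hash_with_user_id` and `hash_with_random_salt` and the 16-byte salt length are my own assumptions. One thing I noticed while writing it: joining the ID and the password as plain text is ambiguous, so a separator or a fixed-width ID would probably be needed.

```python
import hashlib
import os
from typing import Tuple


def hash_with_user_id(user_id: int, password: str) -> str:
    """sha256_hex(unique_user_id + user_supplied_password), joined as plain text."""
    return hashlib.sha256(f"{user_id}{password}".encode("utf-8")).hexdigest()


def hash_with_random_salt(password: str, salt: bytes = b"") -> Tuple[bytes, str]:
    """Alternative: a random 16-byte salt that has to be stored with the hash."""
    salt = salt or os.urandom(16)
    return salt, hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users with the same password get different hashes, which is goal #2.
print(hash_with_user_id(1, "mypassword") == hash_with_user_id(2, "mypassword"))  # False

# Plain concatenation is ambiguous: user 1 with "2pass" and user 12 with "pass"
# both end up hashing the text "12pass".
print(hash_with_user_id(1, "2pass") == hash_with_user_id(12, "pass"))  # True

# The random salt must be kept so the same hash can be recomputed later.
salt, stored = hash_with_random_salt("mypassword")
print(hash_with_random_salt("mypassword", salt)[1] == stored)  # True
```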
If someone can steal my users' hashed passwords then I must assume they can get whatever they want -- including the source code that generates the hash. So, is there any benefit to adding an extra random string (the same string for every user) to the password before hashing? That is:
# app_wide_string = a random string of 64 7-bit *characters*, generated once.
hash = sha256_hex( unique_user_id + app_wide_string + user_supplied_password );
I have seen that suggested, but I don't understand what I gain from it over the per-user salt. If someone wanted to brute-force the hashes they would know "app_wide_string" and use it when running their dictionary attack, right?
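This is the scenario I am picturing, as a rough Python sketch. The helpers `new_app_wide_string` and `hash_password`, the letters-and-digits alphabet, and the tiny guess list are all made up for illustration; the point is only that once the string leaks along with the hashes, the dictionary attack looks the same as before.

```python
import hashlib
import secrets
import string

# Letters and digits only, so every character fits in 7 bits (an assumption).
ALPHABET = string.ascii_letters + string.digits


def new_app_wide_string(length: int = 64) -> str:
    """Generate the one-time, application-wide random string."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def hash_password(user_id: int, app_wide_string: str, password: str) -> str:
    """sha256_hex(unique_user_id + app_wide_string + user_supplied_password)."""
    data = f"{user_id}{app_wide_string}{password}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


app_wide_string = new_app_wide_string()
stored = hash_password(42, app_wide_string, "mypassword")

# An attacker who also stole the string just feeds it into every guess.
leaked_string = app_wide_string
guesses = ["123456", "letmein", "mypassword"]
cracked = [g for g in guesses if hash_password(42, leaked_string, g) == stored]
print(cracked)  # ['mypassword']
```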
Is there a good reason to use bcrypt over rolling my own as described above? Maybe the fact that I'm asking these questions is reason enough?
BTW -- I just timed an existing hashing function I have, and on my laptop it generates about 7000 hashes a second. That is nowhere near the one or two seconds per hash that is often suggested.
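For what it's worth, this is roughly how the loop count could be worked out from a measured rate, as a Python sketch. The names `hashes_per_second` and `iterations_for` are hypothetical, and the sketch times bare SHA-256 calls rather than my existing function, so its measured rate will not match the 7000 figure.

```python
import hashlib
import time


def hashes_per_second(sample_size: int = 20_000) -> float:
    """Time `sample_size` chained SHA-256 calls and return the measured rate."""
    digest = b"mypassword"
    start = time.perf_counter()
    for _ in range(sample_size):
        digest = hashlib.sha256(digest).digest()
    elapsed = time.perf_counter() - start
    return sample_size / elapsed


def iterations_for(target_seconds: float, rate: float) -> int:
    """Loop count needed so one password check takes about `target_seconds`."""
    return max(1, round(target_seconds * rate))


print(iterations_for(1.0, 7000))  # 7000
print(iterations_for(2.0, 7000))  # 14000
print(iterations_for(1.0, hashes_per_second()))  # depends on the machine
```

So at 7000 hashes a second, my function would need to be looped about 7000 times to cost one second per password on this laptop, and an attacker with faster hardware would pay correspondingly less.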
Some related links:
using sha256 as hashing and salting with user's ID
SHA512 vs. Blowfish and Bcrypt
What is the optimal length for user password salt?