I know this is a simple question, and maybe it is similar to others, but I couldn't find my answer in those, so I'm asking a new one.
I would like to understand how passwords are checked against a database. I know that encryption and hashing algorithms are used, but I can't understand how the stored values can be compared with the password a user types into an input field in order to verify it.
As I have read here in many answers, you can't reverse an MD5-hashed password, for example, because hashing only works in one direction. So how does the database use those hashes?
In most cases, passwords are hashed and salted client-side, and then (usually) a POST request containing login data is sent to some API. This API then calls SQL (or NoSQL) databases server-side, so the user cannot directly interact with any parts of the database that aren't relevant to what they need. At this point, all you need to do (server-side) is compare the hashed/salted password submitted by the user (client-side) with what is stored in the database. If everything matches, then the login is presumably successful (assuming things like usernames, session tokens, etc. are also valid).
Note: MD5 isn't considered secure anymore, so your best bet is to go for something like SHA-256 for hashing.
In very simplistic terms:
Hashing takes input of any size and generates a "checksum" for it. Whether the input is the letter "a" or the complete works of Shakespeare, the resulting hash will be of the same length. For MD5, such a hash can be represented as a string of 32 hexadecimal characters. You can test this using any online MD5 hash generator.
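As a quick way to try the fixed-length property without an online tool, here is a minimal Python sketch using the standard library's hashlib module. The helper name md5_hex and the sample inputs are only illustrative.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

# A one-byte input and a much longer input (illustrative sample text).
short_hash = md5_hex(b"a")
long_hash = md5_hex(b"To be, or not to be, that is the question. " * 10_000)

# Both digests are 32 hexadecimal characters long, whatever the input size.
print(short_hash, len(short_hash))
print(long_hash, len(long_hash))
```

Changing a single character of either input produces a completely different 32-character string.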
Notice that
Now instead of storing a password in a database, a hash of that password can be stored. To check the password, the value entered into the input field is simply hashed again. If it generates the same hash as what is stored in the DB, then the password is assumed to be correct, and the actual password will never actually be stored in the database.
I say "assumed" since there is a theoretical possibility that two different inputs can generate the same hash (a collision), although that is highly unlikely in the context of entering a password (for some actual collision examples, see this question on Crypto.StackExchange).
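The check described above might look roughly like this in Python. It is a sketch of the principle only: MD5 is used just to mirror the text, there is no salt yet, and the function names and sample password are made up for illustration.

```python
import hashlib

def hash_password(password: str) -> str:
    # MD5 only mirrors the explanation; it is not suitable for real password storage.
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# At registration, only the hash is kept, never the password itself.
stored_hash = hash_password("correct horse")

def check_password(candidate: str, stored: str) -> bool:
    # Hash the submitted value again and compare it with the stored hash.
    return hash_password(candidate) == stored

print(check_password("correct horse", stored_hash))  # True
print(check_password("wrong guess", stored_hash))    # False
```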
Rainbow tables and reversing hashes
A rainbow table is basically a list of typical passwords for which the hashes have been calculated. Example using MD5:
You can easily query such lists using any of a number of so-called MD5 "decryptors" (just to be clear: they are not really decrypting anything, only matching a hash against a list of known values).
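To make the "matching, not decrypting" point concrete, here is a small Python sketch of a precomputed lookup. It is a simplification: the password list is a hypothetical four-entry sample, and real rainbow tables use a more compact chained structure than a plain dictionary.

```python
import hashlib
from typing import Optional

# Hypothetical list of common passwords; real tables hold millions of entries.
common_passwords = ["123456", "password", "qwerty", "letmein"]

# Precompute hash -> password once, so that later lookups are cheap.
lookup = {hashlib.md5(p.encode("utf-8")).hexdigest(): p for p in common_passwords}

def reverse_lookup(md5_hash: str) -> Optional[str]:
    # Nothing is decrypted here: the hash is only matched against known values.
    return lookup.get(md5_hash)

leaked_hash = "5f4dcc3b5aa765d61d8327deb882cf99"  # MD5 of "password"
print(reverse_lookup(leaked_hash))  # password
print(reverse_lookup("0" * 32))     # None, because this hash is not in the table
```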
If someone can access your database and get your list of password hashes, then such a list might be used to infer the passwords for some of your users if
Salting basically means adding some extra piece of information before generating a hash, so that the resulting hash will be less likely to appear in a rainbow table. As an example, imagine you hash not just your password, but the combination of username and password: password will generate the hash 5f4dcc3b5aa765d61d8327deb882cf99, which will probably be in every rainbow table, while khkhkkPassword will generate be2d1a6255d12f44b8a44f25aea41516, which will probably not be in any of them.
Those are the basics, and the principles described here for MD5 should be the same for other algorithms. Note, however, that MD5 has been considered insecure for a long time, and that other, more robust options should be used. There are several options, and in many cases there will be tools, libraries, and best practices available for whatever programming language or framework you are working with, which can help simplify the choice and implementation.
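The salting idea described above can be sketched in Python as follows. The helper names and the username "alice" are hypothetical, and the salt is simply prepended to the password as in the username example. A random per-user salt stored next to the hash is a common alternative.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: str) -> str:
    # The extra information (the salt) is combined with the password before hashing.
    return md5_hex(salt + password)

unsalted = md5_hex("password")
salted = salted_hash("password", salt="alice")  # "alice" is a made-up username

print(unsalted)            # 5f4dcc3b5aa765d61d8327deb882cf99
print(salted)              # a different hash for the same password
print(unsalted == salted)  # False
```

To verify a login, the same salt has to be combined with the submitted password again, so the salt must be known or stored alongside the hash.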
When checking a username/password combination, the login information is sent to the server via POST. The server-side scripts will then (if built correctly) reapply the same logic that they used to hash the information in the first place.
For example, if a script calls md5($password) during registration, the user's password will be stored in the database as an MD5 hash. When attempting to log in as this user, the login script must apply the same hashing method used at registration (in this case, md5($password)) so that the two MD5 strings can be compared against one another.
In this way, you're comparing the hashed values. Because applying the same hashing mechanism to the same string always gives the same output, comparing the hashes is equivalent to comparing the original strings. This is how you know whether the user got the right password or not.
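A rough Python equivalent of that registration/login flow is shown below. It assumes an in-memory dictionary as a stand-in for the database, and the function names are hypothetical. MD5 appears only because the example above uses md5($password).

```python
import hashlib
import hmac

users = {}  # hypothetical in-memory stand-in for the users table

def hash_password(password: str) -> str:
    # The single hashing routine shared by registration and login.
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Store only the hash of the password.
    users[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Re-hash the submitted password and compare the two hash strings.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "s3cret")
print(login("alice", "s3cret"))  # True
print(login("alice", "wrong"))   # False
print(login("bob", "s3cret"))    # False, unknown user
```

Routing both register and login through the same hash_password function is what guarantees that the two hashes are comparable.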
Also, keep in mind that encryption and hashing are not the same thing. Encryption is reversible: if you know the key, decryption is trivial. A hash, by contrast, is a one-way function, so the only way to 'crack' a hash is by brute force (guessing what the password might be and then comparing the hashed outputs).
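To illustrate what "guessing and comparing" means, here is a toy Python sketch that tries every lowercase string up to three letters. The function name and the tiny search space are assumptions for demonstration, which is exactly why short, simple passwords are weak.

```python
import hashlib
import itertools
import string
from typing import Optional

def brute_force_md5(target_hash: str, max_length: int = 3) -> Optional[str]:
    # Try every lowercase string of length 1..max_length and compare hashes.
    for length in range(1, max_length + 1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            guess = "".join(letters)
            if hashlib.md5(guess.encode("utf-8")).hexdigest() == target_hash:
                return guess
    return None  # no candidate in the search space produced this hash

target = hashlib.md5(b"cat").hexdigest()
print(brute_force_md5(target))  # cat
```

The hash is never reversed. The loop only finds an input that produces the same output, and the number of candidates grows exponentially with password length.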
Finally, note that MD5 is highly insecure (even with a salt) and should NOT be used for password storage. Instead, you should consider password_hash() and password_verify() (if using PHP) along with a secure algorithm like PASSWORD_BCRYPT or PASSWORD_ARGON2I.
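For readers not using PHP, the same pattern (a random salt plus a deliberately slow hash, then a verify step) can be sketched with Python's standard library. Note the substitution: this uses PBKDF2-HMAC-SHA256 via hashlib.pbkdf2_hmac as a stand-in, not bcrypt or Argon2, which need third-party packages in Python. The function names and the iteration count are assumptions to tune for your own setup.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune for your hardware

def hash_password(password: str) -> Tuple[bytes, bytes]:
    # A fresh random salt per password; it is stored alongside the digest.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Recompute with the stored salt and compare in constant time.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)

salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # True
print(verify_password("wrong", salt, digest))   # False
```

Because the salt is random, hashing the same password twice gives two different digests, which is also how PHP's password_hash() behaves.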