What is md5() for?

Posted 2019-03-08 13:34

三岁会撩人

Question:

I was reading this tutorial for a simple PHP login system.

In the end it recommends that you should encrypt your password using md5().

Though I know this is a beginners' tutorial, and you shouldn't put bank statements behind this login system, this got me thinking about encryption.

So I went ahead and went to (one of the most useful questions this site has for newbies): What should a developer know before building a public web site?

There it says (under security) you should:

~~Encrypt~~ Hash and salt passwords rather than storing them plain-text.

It doesn't say much more about it, no references.

So I went ahead and tried it myself:

$pass = "Trufa";
$enc = md5($pass);

echo $enc; #will echo 06cb51ce0a9893ec1d2dce07ba5ba710

And this is what got me thinking: although I know md5() might not be the strongest way to encrypt, anything that always produces the same result can be reverse engineered.

So what is the sense of encrypting something with md5() or any other method?

If a hacker gets to a password encrypted with md5(), he would just use this page!

So now the actual questions:

How does password encryption work?

I know I have not discovered a huge web vulnerability here! :) I just want to understand the logic behind password encryption.

I'm sure I'm misunderstanding something, and would appreciate it if you could help me set my thoughts, and others' (I hope), straight.

How would you have to apply password encryption so that it is actually useful?

What about this idea?

As I said, I may be getting the whole idea wrong, but would this method add any security in a real environment?

$reenc = array(
 "h38an",
 "n28nu",
 "fw08d"
 );

$pass = "Trufa";

$enc = chunk_split(md5($pass),5,$reenc[mt_rand(0,count($reenc)-1)]);

echo $enc;

As you see, I randomly added arbitrary strings ($reenc = array()) to my md5() password "making it unique". This of course is just a silly example.

I may be wrong but unless you "seed the encryption yourself" it will always be easily reversible.

The above would be my idea of "password protecting" an encrypted password. If a hacker gets to it, he won't be able to decrypt it unless he gets access to the raw .php.

I know this might not even make sense, but I can't figure out why this is a bad idea!

I hope I've made myself clear enough, but this is a very long question so, please ask for any clarification needed!

Thanks in advance!!

Answer 1:

You should use a hash function like md5 or sha512. You should also have two different salts: a static salt (written by you) and a unique salt for that specific password.

Some sample code (e.g. registration.php):

$unique_salt = hash('md5', microtime()); 
$password = hash('md5', $_POST['password'].'raNdoMStAticSaltHere'.$unique_salt);

Now you have a static salt, which is valid for all your passwords and is stored in the .php file. Then, when registration runs, you generate a unique salt for that specific password.

The end result: two passwords that are spelled exactly the same will have two different hashes. The unique salt is stored in the database along with the current id. If someone grabs the database, they will have every unique salt for every password. What they don't have is your static salt, which makes things a lot harder for every "hacker" out there.

This is how you check the validity of your password on login.php for example:

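$user = 'someuser'; // placeholder: the username submitted at login
// Note: the mysql_* functions were removed in PHP 7; new code should use mysqli
// or PDO with prepared statements (interpolating $user invites SQL injection).
$querysalt = mysql_query("SELECT salt FROM password WHERE username='$user'");
while ($salt = mysql_fetch_array($querysalt)) {
    // Rebuild the hash from the submitted password, the static salt and the stored salt
    $password = hash('md5',
          $_POST['userpassword'].'raNdoMStAticSaltHere'.$salt['salt']);
}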

This is what I've used in the past. It's very powerful and secure. I prefer sha512 myself; you just pass that to the hash function instead of md5 in my example.

If you wanna be even more secure, you can store the unique salt in a completely different database.

Answer 2:

Firstly, "hashing" (using a cryptographic one way function) is not "encrypting". In encryption, you can reverse the process (decryption). In hashing, there is (theoretically) no feasible way of reversing the process.

A hash is some function f such that v cannot be determined from f(v) easily.

The point of using hashing for authentication is that you (or someone seeing the hash value) do not have any feasible way (again, theoretically) of knowing the password. However, you can still verify that the user knows his password. (Basically, the user proves that he knows v such that f(v) is the stored hash).

The weakness of simply hashing (aside from weak hash functions) is that people can compile tables of passwords and their corresponding hash and use them to (effectively) get the inverse of the hash function. Salting prevents this because then a part of the input value to the hash is controlled and so tables have to be compiled for that particular salt.

So practically, you store a salt and a hash value, and authenticate by hashing a combination of the salt and the password and comparing that with your hash value.
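The store-then-compare flow described above could be sketched roughly as follows. This is only an illustration: the function names make_record and check_password are hypothetical, and sha256 is an assumption chosen to keep the mechanics visible rather than a recommendation for real password storage.

```php
<?php
// Illustrative sketch only: sha256 is an assumption made here to show the
// mechanics; a dedicated password-hashing function is preferable in practice.

// Create the salt and hash that would be stored for a new user.
function make_record(string $password): array
{
    $salt = bin2hex(random_bytes(16));          // unique random salt per user
    $hash = hash('sha256', $salt . $password);  // hash of salt + password
    return ['salt' => $salt, 'hash' => $hash];
}

// Re-hash the submitted password with the stored salt and compare.
function check_password(string $password, array $record): bool
{
    $candidate = hash('sha256', $record['salt'] . $password);
    return hash_equals($record['hash'], $candidate); // constant-time comparison
}

$alice = make_record('Trufa');
$bob   = make_record('Trufa');

var_dump(check_password('Trufa', $alice));   // correct password
var_dump(check_password('trufa', $alice));   // wrong password
var_dump($alice['hash'] === $bob['hash']);   // same password, different salts
```

Because each record gets its own salt, the two users end up with different stored hashes even though they chose the same password.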

Answer 3:

MD5 is a one way hashing function which will guard your original password more or less safely.

So, let's say your password is "Trufa", and its hashed version is 06cb51ce0a9893ec1d2dce07ba5ba710.

For example, when you sign up on a new website, they ask you for your username and password. When you write "Trufa" as your password, the value 06cb51ce0a9893ec1d2dce07ba5ba710 is stored in the database because it is hashed.

The next time you log in, and you write "Trufa", the hashed value will be compared to the one in the database. If they are the same, you are authenticated! Providing you entered the right username, of course.

If your password wasn't stored in its hashed form in database, some malicious person might run a query somehow on that database and see all real passwords. And that would be compromising.

Also, since MD5 produces a 128-bit hash, there are 2^128 = 340282366920938463463374607431768211456 possible hash values.

Since there are more possible strings than this, two different strings can produce the same hash value. This is called a collision, and it means that a hash cannot be uniquely reversed to a single original password.

Answer 4:

The only vulnerability with salting is that you need to know what the salt is in order to reconstruct the hash for testing the password. The usual way around this is to store the entry in the authdb in the form <algorithm>$<salt>$<hash>. This way the authdb entry can be used by any code that has access to it.
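As a rough sketch of that storage format, the entry can be built and later split back into its three parts. The helper names encode_entry and verify_entry are hypothetical, and sha256 with a hex salt is an assumption made for the example.

```php
<?php
// Hypothetical helpers; sha256 and the exact layout are assumptions.

// Build "<algorithm>$<salt>$<hash>" for storage.
function encode_entry(string $algorithm, string $salt, string $password): string
{
    $hash = hash($algorithm, $salt . $password);
    return $algorithm . '$' . $salt . '$' . $hash;
}

// Split a stored entry and check a submitted password against it.
function verify_entry(string $entry, string $password): bool
{
    // Hex salts and hashes never contain '$', so a plain split is safe here.
    [$algorithm, $salt, $hash] = explode('$', $entry, 3);
    return hash_equals($hash, hash($algorithm, $salt . $password));
}

$entry = encode_entry('sha256', bin2hex(random_bytes(8)), 'Trufa');
echo $entry, "\n";
var_dump(verify_entry($entry, 'Trufa'));
```

Keeping the algorithm name in the entry means the verifying code does not need to know in advance how a given record was hashed.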

Answer 5:

You're missing the important step - the salt. This is a unique (per user, ideally) bit of extra data that you add to the password before hashing it.

http://en.wikipedia.org/wiki/Salt_%28cryptography%29

Answer 6:

Your idea (salting) is well known and is actually well-implemented in the PHP language. If you use the crypt() function it allows you to specify a string to hash, a method to encrypt (in some cases), and a salt. For example,

$x = crypt('insecure_password', $salt);

Returns a hashed and salted password ready for storage. Passwords get cracked the same way that we check if they're right: we check the hash of what the user inputs against the hash of their password in the database. If they match, they're authenticated (AFAIK this is the most common way to do this, if not the only). Insecure passwords (like password) that use dictionary words can be cracked by comparing their hash to hashes of common passwords. Secure passwords cannot be cracked this way, but can still be cracked. Adding a salt to the password makes it much more difficult to crack: since the hacker most likely doesn't know what the salt is, his dictionary attack won't work.
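For illustration, here is one way the $salt argument might be built and how a login check could reuse the stored value. The choice of bcrypt (the "$2y$" prefix) with cost 10 is an assumption on my part, not something the answer specifies.

```php
<?php
// Build a bcrypt salt string for crypt(): "$2y$", a two-digit cost, "$",
// then 22 characters from the alphabet ./0-9A-Za-z.
$random = base64_encode(random_bytes(16));            // 24 chars incl. padding
$salt   = '$2y$10$' . substr(strtr($random, '+', '.'), 0, 22);

$stored = crypt('insecure_password', $salt);          // value to store

// To check a login attempt, pass the stored value back in as the salt:
// crypt() reads the algorithm, cost and salt from its prefix.
$ok  = hash_equals($stored, crypt('insecure_password', $stored));
$bad = hash_equals($stored, crypt('wrong_guess', $stored));

var_dump($ok, $bad);
```

Since the salt travels inside the stored string, nothing else has to be saved alongside it.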

Answer 7:

For a decent hash the attacker won't be reversing the hash, they'll be using a rainbow table, which is essentially a brute-force method made useful if everyone uses the same hash function.

The idea of a rainbow table is that since hashing is fast I can hash every possible value you could use as a password, store the result, and have a map of which hash connects to which password. If everyone just takes their passwords and hashes them with MD5 then my hash table is good for any set of password hashes I can get my hands on!

This is where salting comes in. If I take the password the user enters and add some data which is different for every user, then that list of pre-determined hashes is useless since the hash is of both the password and some random data. The data for the salt could be stored right beside the password and even if I get both it doesn't help me get the password back since I still have to essentially brute force the hash separately for every single user - I can't form a single rainbow table to attack all the hashes at once.
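A small sketch can make this concrete. Note that it uses a plain precomputed lookup table, as described above; a real rainbow table compresses the same idea using hash chains. The four-word list is a stand-in assumed for the example.

```php
<?php
// Tiny stand-in for an attacker's word list (an assumption for illustration).
$wordlist = ['password', '123456', 'letmein', 'Trufa'];

// Precompute hash => password once; reusable against any unsalted md5 dump.
$table = [];
foreach ($wordlist as $word) {
    $table[md5($word)] = $word;
}

// Unsalted hash: a single array lookup recovers the password.
$unsalted = md5('Trufa');
$found = $table[$unsalted] ?? null;

// Salted hash: the same table no longer contains the value.
$salt = bin2hex(random_bytes(8));
$salted = md5($salt . 'Trufa');
$foundSalted = $table[$salted] ?? null;

var_dump($found);        // the recovered password
var_dump($foundSalted);  // NULL, barring an astronomically unlikely collision
```

An attacker who also has the salt can still rebuild the table for that one user, which is exactly the per-user brute force the answer describes.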

Of course, ideally an attacker won't get the list of hashed passwords in the first place, but some employees will have access so it's not possible to secure the password database entirely.

Answer 8:

In addition to what a salt (or seed) provides, md5 is a complex hashing algorithm which uses mathematical rules to produce a result that is specifically not reversible, because of the mathematical transformations and the loss of data along the way.

http://en.wikipedia.org/wiki/Cryptographic_hash_function

Answer 9:

md5 (or better put: hash algorithms in general) is used to safely store passwords in a database. The most important thing to know about hashes is: hashes are not encryption per se (they are one-way encryption at most). If you encrypt something, you can get the data back with the key you used. A hash generates a fixed-length value from an arbitrary input (like a string), which can be used to see if the same input was used.

Hashes are used to store sensitive, repeatedly entered data in a storage device. Doing this, nobody can recreate the original input from the hash data, but you can hash an incoming password and compare it to the value in the database. If both are the same, the password was correct.

You already pointed out that there are possibilities to break the algorithm, either by using a database of value/hash pairs or by producing collisions (different values resulting in the same hash value). You can obscure this a bit by using a salt, thus modifying the algorithm. But if the salt is known, it can be used to break the algorithm again.

Answer 10:

I like this question. But I think you've really answered yourself.

The site you referenced uses dictionary lookups of known, unsalted, md5's - it doesn't "crack" anything.

Your example is almost good, except your application needs to be able to regenerate the md5 using the same salt every time.

Your example appears to pick one of the salts at random, which will fail 2 times out of 3 when you compare a user's stored password hash to one computed from their input.
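To see why, the sketch below lists what each of the three filler strings from the question would produce, instead of picking one at random. It reuses the question's $reenc array and password; everything else is illustrative.

```php
<?php
$reenc = ['h38an', 'n28nu', 'fw08d'];
$pass  = 'Trufa';

// Instead of picking one string at random, list what each choice would produce.
$variants = [];
foreach ($reenc as $filler) {
    $variants[] = chunk_split(md5($pass), 5, $filler);
}

// Three different stored values for one password: a login check that picks
// a filler at random again matches the stored one only 1 time in 3.
var_dump(count(array_unique($variants)));

// Anyone who knows (or guesses) the filler can strip it and recover the plain md5.
var_dump(str_replace($reenc[0], '', $variants[0]) === md5($pass));
```

For the comparison to work every time, the filler used at registration would have to be stored with the user and reused at login, which is what a per-user salt does.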

People will tell you to use SHA1 or SHA256 to have a 'stronger' hash, but people will also argue that they're all 'broken.'

Answer 11:

That documentation is misleading -- it teaches a "vulnerable" concept and presents it as somehow being "secure" because it (the saved password) looks like gibberish. Just internet junk that won't die. The following link should clear things up (you have already found a good bit of it though, it seems. Good work.)

Enough With The Rainbow Tables: What You Need To Know About Secure Password Schemes talks about MD5 (and why it should not be used) along with salt (e.g. how to thwart rainbow attacks) as well as provides useful insights (such as "Use someone else’s password system. Don’t build your own"). It is a fairly good overview.

Answer 12:

This is my question about the aspects of md5 collision, slightly related to your question:

Is there any difference between md5 and sha1 in this situation?

The important part is in the first 3 rows, that is: you must put your salt before the password, if you want to achieve stronger protection, not after.

Answer 13:

To simply answer the title of your question, md5's only real use nowadays is for hashing large strings (such as files) to produce checksums. These are typically used to see if both strings are identical (in terms of files, checksums are frequently used for security purposes to ensure a file being distributed hasn't been tampered with, for example).
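A minimal sketch of the checksum use, with temporary files and contents made up for the example. It shows detection of differing content; since md5 collisions can be crafted deliberately, a stronger algorithm (for instance sha256 via hash_file) is the usual choice when deliberate tampering is the concern.

```php
<?php
// Write two temporary files with identical content and a third that differs.
$a = tempnam(sys_get_temp_dir(), 'md5');
$b = tempnam(sys_get_temp_dir(), 'md5');
$c = tempnam(sys_get_temp_dir(), 'md5');
file_put_contents($a, "release contents\n");
file_put_contents($b, "release contents\n");
file_put_contents($c, "release contents (modified)\n");

// md5_file() hashes a file's contents; equal files give equal checksums.
$sameChecksum = md5_file($a) === md5_file($b);
$diffChecksum = md5_file($a) === md5_file($c);
var_dump($sameChecksum, $diffChecksum);

// Clean up the temporary files.
unlink($a);
unlink($b);
unlink($c);
```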

To address each of your inline questions:

How does password encryption work? How would you have to apply password encryption so that it is actually useful?

Secure password hashing works by taking the password in plain-text form and applying a costly hashing function to it, salted with a cryptographically secure random salt. See the Secure hash and salt for PHP passwords question for more detail on this.

What about this idea?

Password hashing does not need to be complicated like that, and nor should it be. Avoid thinking up your own algorithms and stick with the tried and tested hashing algorithms already out there. As the question linked above mentions, md5() for password hashing has been obsolete for many years now, and so it should be avoided.

Your method of generating a "random" salt from an array of three different salts is not the randomness you're looking for. You need unique randomness that is suitable for cryptographic use (i.e. produced by a cryptographically secure pseudo-random number generator (CSPRNG)). If you're using PHP 7 or above, the random_bytes function can be used to generate a cryptographically secure salt (for PHP 5 users, the random_compat library can be used).
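As one possible way to follow the "tried and tested" advice, PHP's built-in password_hash and password_verify (available since PHP 5.5) handle the costly hashing and the salt for you. Using them here is my suggestion for illustration; the answer itself only names random_bytes.

```php
<?php
// password_hash() generates a cryptographically secure salt itself and
// embeds the algorithm, cost and salt in the returned string.
$stored = password_hash('Trufa', PASSWORD_DEFAULT);

var_dump(password_verify('Trufa', $stored));  // correct password
var_dump(password_verify('trufa', $stored));  // wrong password

// If you do need raw random bytes (for example a token), random_bytes()
// is the CSPRNG-backed function mentioned above.
$token = bin2hex(random_bytes(16));
var_dump(strlen($token)); // 32 hex characters for 16 bytes
```

Hashing the same password twice gives two different strings, because a fresh salt is drawn on every call.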