I see a lot of confusion between hashes and encryption algorithms and I would like to hear some more expert advice about:
When to use hashes vs encryption
What makes a hash or encryption algorithm different (from a theoretical/mathematical level), i.e. what makes hashes irreversible (without the aid of a rainbow table)
Here are some similar SO Questions that didn't go into as much detail as I was looking for:
What is the difference between Obfuscation, Hashing, and Encryption?
Difference between encryption and hashing
Symmetric Encryption:
Symmetric encryption may also be referred to as shared key or shared secret encryption. In symmetric encryption, a single key is used both to encrypt and decrypt traffic.
Asymmetric Encryption:
Asymmetric encryption is also known as public-key cryptography. Asymmetric encryption differs from symmetric encryption primarily in that two keys are used: one for encryption and one for decryption. The most common asymmetric encryption algorithm is RSA. Compared to symmetric encryption, asymmetric encryption imposes a high computational burden, and tends to be much slower. Thus, it isn't typically employed to protect payload data. Instead, its major strength is its ability to establish a secure channel over a nonsecure medium (for example, the Internet). This is accomplished by the exchange of public keys, which can only be used to encrypt data. The complementary private key, which is never shared, is used to decrypt.
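To make the two-key idea concrete, here is a toy sketch in Python of textbook RSA with tiny primes. The primes, the exponent, and the function names are illustrative assumptions, not anything from the quoted article, and numbers this small offer no security at all; real RSA uses very large primes plus padding.

```python
# Toy textbook RSA with tiny numbers: for illustration only, NOT secure.
p, q = 61, 53                 # two small primes (assumed for the example)
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent (needs Python 3.8+)

def encrypt(message: int) -> int:
    """Anyone holding the public key (e, n) can do this."""
    return pow(message, e, n)

def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key (d, n) can do this."""
    return pow(ciphertext, d, n)

message = 65                  # must be an integer smaller than n
ciphertext = encrypt(message)
print(ciphertext != message, decrypt(ciphertext) == message)
```

The point is only that what is locked with one key is unlocked with the other; the public pair `(e, n)` can be handed out freely while `d` stays private.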
Hashing:
Finally, hashing is a form of cryptographic security which differs from encryption. Whereas encryption is a two-step process used to first encrypt and then decrypt a message, hashing condenses a message into an irreversible fixed-length value, or hash. Two of the most common hashing algorithms seen in networking are MD5 and SHA-1.
Read more here: http://packetlife.net/blog/2010/nov/23/symmetric-asymmetric-encryption-hashing/
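The "fixed-length value" part is easy to check for yourself. A minimal sketch in Python using the standard `hashlib` module (the helper name `digest_lengths` is made up for this example):

```python
import hashlib

def digest_lengths(message: bytes) -> dict:
    """Return the length, in hex characters, of the MD5 and SHA-1 digests."""
    return {
        "md5": len(hashlib.md5(message).hexdigest()),
        "sha1": len(hashlib.sha1(message).hexdigest()),
    }

# Inputs of very different sizes all condense to the same digest length.
for message in (b"", b"hello", b"x" * 10_000):
    print(len(message), digest_lengths(message))
```

Whatever the input size, MD5 yields 128 bits (32 hex characters) and SHA-1 yields 160 bits (40 hex characters).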
A hash function turns a variable-sized amount of text into a fixed-size text.
Source: https://en.wikipedia.org/wiki/Hash_function
Let's see it in action. I use PHP for it.
HASH:
DEHASH:
SHA1 is a one-way hash, which means that you can't dehash the hash. However, you can brute-force it. Please see: https://hashkiller.co.uk/sha1-decrypter.aspx.
MD5 is another hash. An MD5 "dehasher" (really a lookup of previously computed hashes) can be found on this website: https://www.md5online.org/.
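Those sites don't reverse anything; they hash a lot of candidate inputs and look your hash up among the results. A rough sketch of that idea in Python (not how any particular site is implemented; the word list and function names are assumptions for illustration):

```python
import hashlib

def sha1_hex(text: str) -> str:
    """SHA-1 of a UTF-8 string, as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def build_lookup(candidates):
    """Precompute hash -> plaintext for every candidate input."""
    return {sha1_hex(candidate): candidate for candidate in candidates}

def crack(target_hash: str, lookup: dict):
    """Return the matching candidate, or None if it was never hashed."""
    return lookup.get(target_hash)

wordlist = ["password", "letmein", "123456", "qwerty"]  # hypothetical guesses
lookup = build_lookup(wordlist)

print(crack(sha1_hex("letmein"), lookup))                       # found: it was in the list
print(crack(sha1_hex("correct horse battery staple"), lookup))  # None: never guessed
```

This only works for inputs somebody already guessed, which is why long, unusual inputs are much harder to recover.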
An Encryption function transforms a text into a nonsensical ciphertext by using an encryption key, and vice versa.
Source: https://en.wikipedia.org/wiki/Encryption
--- Example: The Mcrypt extension in PHP ---
ENCRYPT:
DECRYPT:
--- Example: The OpenSSL extension in PHP ---
The Mcrypt extension was deprecated in PHP 7.1 and removed in PHP 7.2. The OpenSSL extension should be used in PHP 7. See the code snippets below:
Here is a basic overview of hashing and encryption/decryption techniques.
UPDATE: To address the points mentioned in the edited question.
A hash function could be considered the same as baking a loaf of bread. You start out with inputs (flour, water, yeast, etc...) and after applying the hash function (mixing + baking), you end up with an output: a loaf of bread.
Going the other way is extraordinarily difficult - you can't really separate the bread back into flour, water, yeast - some of that was lost during the baking process, and you can never tell exactly how much water or flour or yeast was used for a particular loaf, because that information was destroyed by the hashing function (aka the oven).
Many different variants of inputs will theoretically produce identical loaves (e.g. 2 cups of water and 1 tbsp of yeast produce exactly the same loaf as 2.1 cups of water and 0.9 tbsp of yeast), but given one of those loaves, you can't tell exactly what combo of inputs produced it.
Encryption, on the other hand, could be viewed as a safe deposit box. Whatever you put in there comes back out, as long as you possess the key with which it was locked up in the first place. It's a symmetric operation. Given a key and some input, you get a certain output. Given that output, and the same key, you'll get back the original input. It's a 1:1 mapping.
Use hashes when you don't want to be able to get back the original input; use encryption when you do.
Hashes take some input and turn it into some bits (usually thought of as a number, like a 32-bit integer, 64-bit integer, etc). The same input will always produce the same hash, but in principle you lose information in the process, so you can't reliably reproduce the original input (there are a few caveats to that, however).
Encryption, in principle, preserves all of the information you put into the encryption function; it just makes it hard (ideally impossible) for anyone to reverse back to the original input without possessing a specific key.
Simple Example of Hashing
Here's a trivial example to help you understand why hashing can't (in the general case) get back the original input. Say I'm creating a 1-bit hash. My hash function takes a bit string as input and sets the hash to 1 if there is an even number of bits set in the input string, else 0 if there is an odd number.
Example:
Note that there are many input values that result in a hash of 0, and many that result in a hash of 1. If you know the hash is 0, you can't know for sure what the original input was.
By the way, this 1-bit hash isn't exactly contrived... have a look at the parity bit.
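A minimal sketch of that 1-bit hash in Python, following the rule as described above (1 for an even number of set bits, 0 for an odd number). The function name and the sample inputs are just for illustration:

```python
def one_bit_hash(bits: str) -> int:
    """Return 1 if the bit string has an even number of 1s, else 0."""
    if set(bits) - {"0", "1"}:
        raise ValueError("input must contain only '0' and '1'")
    return 1 if bits.count("1") % 2 == 0 else 0

# Several different inputs collapse onto each of the two possible hashes.
for sample in ("0", "11", "0110", "1", "1011", "111"):
    print(sample, "->", one_bit_hash(sample))
```

The first three samples hash to 1 and the last three hash to 0, so knowing the hash alone cannot tell you which input produced it.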
Simple Example of Encryption
You might encrypt text by using a simple letter substitution, say if the input is A, you write B. If the input is B, you write C. All the way to the end of the alphabet, where if the input is Z, you write A again.
Just like the simple hash example, this type of encryption has been used historically.
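The substitution above is a shift cipher (often called a Caesar cipher) with a shift of 1. A small sketch in Python, where the shift amount plays the role of the key; the function names are assumptions for the example, and it only handles uppercase A-Z:

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...Z"

def shift(text: str, key: int) -> str:
    """Shift each uppercase letter by `key` places, wrapping around at Z."""
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)  # leave spaces, digits, etc. untouched
    return "".join(shifted)

def encrypt(plaintext: str, key: int = 1) -> str:
    return shift(plaintext, key)

def decrypt(ciphertext: str, key: int = 1) -> str:
    return shift(ciphertext, -key)  # undo the shift with the same key

secret = encrypt("HAZE")
print(secret)           # IBAF
print(decrypt(secret))  # HAZE
```

Unlike the 1-bit hash, nothing is lost here: every ciphertext maps back to exactly one plaintext once you know the key.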