Can anyone please explain to me how to transform a phrase like "I want to buy some milk" into MD5? I read the Wikipedia article on MD5, but the explanation given there is beyond my comprehension:
"MD5 processes a variable-length message into a fixed-length output of 128 bits. The input message is broken up into chunks of 512-bit blocks (sixteen 32-bit little endian integers)"
"sixteen 32-bit little endian integers" is already hard for me. I checked the Wiki article on little endians and didn't understand a bit.
However, the examples of some phrases and their MD5 hashes in that Wiki article are very nice:
MD5("The quick brown fox jumps over the lazy dog") = 9e107d9d372bb6826bd81d3542a419d6
MD5("The quick brown fox jumps over the lazy dog.") = e4d909c290d0fb1ca068ffaddf22cbd0
Can anyone, please, explain to me how this MD5 algorithm works using some very simple example?
Also, perhaps you know of some software or code that would transform phrases into their MD5 hashes. If so, please let me know.
Forget about the endians: it's just a name for a way to encode information (the order in which the bytes of a number are stored).
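If you do get curious about the endian part later, here is a small sketch of what it means in practice. The value 0x0A0B0C0D and the four bytes "I wa" (the start of the phrase from the question) are just illustrative picks of mine, not anything special to MD5.

```python
import struct

value = 0x0A0B0C0D

# Little-endian: the least significant byte is stored first.
little = struct.pack("<I", value)
# Big-endian: the most significant byte is stored first.
big = struct.pack(">I", value)

print(little.hex())  # 0d0c0b0a
print(big.hex())     # 0a0b0c0d

# MD5 reads its input four bytes at a time as little-endian 32-bit words.
# The bytes of "I wa" are 0x49 0x20 0x77 0x61.
word = struct.unpack("<I", b"I wa")[0]
print(hex(word))  # 0x61772049
```

So "sixteen 32-bit little-endian integers" only says how each group of four input bytes is turned into a number before the arithmetic starts.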
Let's follow the Wikipedia MD5 article. You start with an input message. It can be arbitrarily long: MD5 hashes for 2GB ISO files are routinely created, just like hashes for strings a dozen characters long (e.g. for passwords).
The hash will be contained in registers a, b, c and d. These registers are initialized with special values (h0-h3).
The algorithm pads the input to a multiple of 512 bits and breaks it into 512-bit blocks, each made up of 16 4-byte chunks ("sixteen 32-bit little-endian words"). For each block it applies specific logical operations (functions F, G, H and I) to parts of the input and the current state of registers a, b, c and d. It does this 64 times for each set of 16 4-byte chunks.
When all of the blocks are processed, the accumulated values of a, b, c and d, written out one after another, form the final 128-bit hash, the one you might get by invoking md5sum testfile.txt.
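To make the steps above concrete, here is a rough Python sketch that follows the pseudocode in the Wikipedia article and RFC 1321. The names md5_sketch, rotl, SHIFTS and K are my own, and the code is meant for reading, not for real use; the comparison against hashlib at the end is only a sanity check.

```python
import hashlib
import math
import struct

# Left-rotation amounts for the 64 operations (four rounds of 16).
SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
          [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)
# 64 constants taken from the integer part of abs(sin(i + 1)) * 2**32.
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]


def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def md5_sketch(message: bytes) -> str:
    # Initial register values h0-h3.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]

    # Padding: a 0x80 byte, zeros up to 56 mod 64 bytes, then the original
    # length in bits as a 64-bit little-endian integer.
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack("<Q", bit_len)

    # Process one 512-bit (64-byte) block at a time.
    for offset in range(0, len(padded), 64):
        # Sixteen 32-bit little-endian words.
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = h
        for i in range(64):
            if i < 16:    # function F
                f, g = (b & c) | (~b & d), i
            elif i < 32:  # function G
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:  # function H
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:         # function I
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + K[i] + m[g]) & 0xFFFFFFFF
            a, d, c = d, c, b
            b = (b + rotl(f, SHIFTS[i])) & 0xFFFFFFFF
        # Add this block's result into the running state.
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d))]

    # The four registers, written out little-endian, are the digest.
    return struct.pack("<4I", *h).hex()


text = b"The quick brown fox jumps over the lazy dog"
print(md5_sketch(text))                                 # 9e107d9d372bb6826bd81d3542a419d6
print(md5_sketch(text) == hashlib.md5(text).hexdigest())  # True
```

The four branches inside the inner loop are the F, G, H and I functions, and the loop over 64-byte slices is the "for each set of 16 4-byte chunks" part.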
Update:
If you just want to be able to calculate a hash, implementing it yourself makes no sense because it's been done and tested for probably every significant language out there:
Python:
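import hashlib  # the old md5 module was removed in Python 3
# hexdigest() returns the familiar 32-character hex string; digest() returns raw bytes
hashlib.md5(b"Nobody inspects the spammish repetition").hexdigest()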
SQL (MySQL):
SELECT MD5('Nobody inspects the spammish repetition')
Java:
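import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
String s = "Nobody inspects the spammish repetition";
MessageDigest m = MessageDigest.getInstance("MD5"); // throws NoSuchAlgorithmException
m.update(s.getBytes(StandardCharsets.UTF_8)); // hash all encoded bytes, not just s.length() of them
System.out.println(String.format("%032x", new BigInteger(1, m.digest()))); // keeps leading zeros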
etc.
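For large files such as the ISO images mentioned earlier, you would not want to load everything into memory first. A minimal sketch with Python's hashlib, assuming a helper name (md5_of_file) and a chunk size of 1 MiB that I picked arbitrarily:

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read until it returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("testfile.txt") should give the same hex string as md5sum testfile.txt, since feeding the data in pieces with update() is equivalent to hashing it all at once.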
MD5 is a hash algorithm: it produces a signature of the input text such that changing any letter in the input has a significant, unpredictable impact on the signature.
For instance:
The md5 signature of the text 'This is a quite short text which looks quite normal' is '2bb1a5a5204aba95c886b3eb598c9d41'
The md5 signature of the same text with an added period, 'This is a quite short text which looks quite normal.' is '870df12558aae47b40bf738290ba8554'
As you can see, the signatures differ significantly. This property makes MD5 suitable as a type of 'fingerprinting': two books that differ by only one letter have completely different MD5s. Furthermore, two MD5s are almost never the same for any pair of different books: accidental collisions are extremely rare.
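If you want to see this effect for yourself, here is a small sketch that hashes the two fox sentences from the question and counts how many of the 128 output bits differ. The helper names md5_hex and differing_bits are made up for this example.

```python
import hashlib


def md5_hex(text):
    """MD5 of a string, encoded as UTF-8, as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


a = md5_hex("The quick brown fox jumps over the lazy dog")
b = md5_hex("The quick brown fox jumps over the lazy dog.")
print(a)  # 9e107d9d372bb6826bd81d3542a419d6
print(b)  # e4d909c290d0fb1ca068ffaddf22cbd0
print(differing_bits(a, b), "of 128 bits differ")
```

For a well-mixing hash you would expect roughly half of the bits to flip, even though only one character was added.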
There are numerous implementations of MD5, including several online versions (here is one). If you want one in a specific language, please specify which.
MD5 is horribly broken and has been for years: collisions can be constructed deliberately. Do not use it for any purpose if you can possibly help it. In new applications, use a SHA-2 hash function such as SHA-256.
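Switching is usually a one-word change. A minimal sketch with Python's hashlib, using the phrase from the question (the helper name sha256_hex is mine):

```python
import hashlib


def sha256_hex(text):
    """SHA-256 of a string, encoded as UTF-8, as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


print(sha256_hex("I want to buy some milk"))
```

The output is 256 bits (64 hex characters) instead of MD5's 128 bits (32 hex characters).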