I just found out about MurmurHash; it seems to be one of the fastest known hash functions and quite collision resistant. I tried to dig into the algorithm and its full source code, but I am having difficulty understanding it. Could someone here explain the algorithm used, or give a full implementation, preferably in C? I read the C source code on the author's website but could not follow it. For example, what are seed, h, k and m?
And what does this mean:
k *= m;
k ^= k >> r;
k *= m;
h *= m;
h ^= k;
data += 4;
len -= 4;
???
Reference : http://murmurhash.googlepages.com/
Sorry for my English and my stupidity.
Cheers
The best explanation of the Murmur algorithm is on the Murmur Hash Wikipedia page:
Murmur3_32(key, len, seed)
    // Note: In this version, all integer arithmetic is performed
    // with unsigned 32-bit integers. In the case of overflow,
    // the result is reduced modulo 2^32.
    c1 ← 0xcc9e2d51
    c2 ← 0x1b873593
    r1 ← 15
    r2 ← 13
    m ← 5
    n ← 0xe6546b64

    hash ← seed

    for each fourByteChunk of key
        k ← fourByteChunk

        k ← k × c1
        k ← (k ROL r1)
        k ← k × c2

        hash ← hash XOR k
        hash ← (hash ROL r2)
        hash ← hash × m + n

    with any remainingBytesInKey
        remainingBytes ← SwapEndianOrderOf(remainingBytesInKey)
        // Note: Endian swapping is only necessary on big-endian machines.

        remainingBytes ← remainingBytes × c1
        remainingBytes ← (remainingBytes ROL r1)
        remainingBytes ← remainingBytes × c2

        hash ← hash XOR remainingBytes

    hash ← hash XOR len

    hash ← hash XOR (hash SHR 16)
    hash ← hash × 0x85ebca6b
    hash ← hash XOR (hash SHR 13)
    hash ← hash × 0xc2b2ae35
    hash ← hash XOR (hash SHR 16)
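Since the question asks for C, here is a minimal sketch of how the pseudocode above might be translated. The names murmur3_32 and rotl32 are my own choices, not taken from the author's source. I am also assuming the key is read as little-endian 4-byte chunks assembled byte by byte, which sidesteps the endian swap and any alignment issues. Treat it as an illustration rather than the reference implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by r bits (0 < r < 32). */
static uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Sketch of Murmur3_32 following the pseudocode step by step. */
uint32_t murmur3_32(const uint8_t *key, size_t len, uint32_t seed)
{
    const uint32_t c1 = 0xcc9e2d51u;
    const uint32_t c2 = 0x1b873593u;
    uint32_t hash = seed;
    size_t nblocks = len / 4;

    /* Body: mix every complete 4-byte chunk into the hash. */
    for (size_t i = 0; i < nblocks; i++) {
        const uint8_t *p = key + 4 * i;
        /* Assemble the chunk as a little-endian 32-bit integer. */
        uint32_t k = (uint32_t)p[0]
                   | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16)
                   | ((uint32_t)p[3] << 24);
        k *= c1;
        k = rotl32(k, 15);
        k *= c2;

        hash ^= k;
        hash = rotl32(hash, 13);
        hash = hash * 5 + 0xe6546b64u;
    }

    /* Tail: mix in the 1 to 3 bytes left over, if any. */
    const uint8_t *tail = key + 4 * nblocks;
    uint32_t k = 0;
    switch (len & 3) {
    case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k ^= tail[0];
            k *= c1;
            k = rotl32(k, 15);
            k *= c2;
            hash ^= k;
    }

    /* Finalization: fold in the length, then scramble the bits. */
    hash ^= (uint32_t)len;
    hash ^= hash >> 16;
    hash *= 0x85ebca6bu;
    hash ^= hash >> 13;
    hash *= 0xc2b2ae35u;
    hash ^= hash >> 16;
    return hash;
}
```

A call would look like murmur3_32((const uint8_t *)"test", 4, 0). An empty key with seed 0 hashes to 0, because every step then operates on zero.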
And my own:
The code is available here.
m and r are constants used by the algorithm.
k *= m means take the variable k and multiply it by m.
k ^= k >> r means take k, shift its bits right by r places (e.g. if r is 2, 110101 becomes 001101), XOR the result with the original k, and store that back in k.
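To put the snippet from the question in context, here is a commented sketch of the whole function, assuming the snippet comes from the 32-bit MurmurHash2 on the author's site. The function name murmur2_sketch is mine, and I load each block byte by byte as little-endian, whereas the original casts the data pointer to an integer pointer. Roughly: seed is a caller-chosen starting value, h is the running hash, k is the current 4-byte block of input, and m and r are fixed mixing constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Annotated sketch of 32-bit MurmurHash2 (name is hypothetical). */
uint32_t murmur2_sketch(const uint8_t *data, size_t len, uint32_t seed)
{
    const uint32_t m = 0x5bd1e995u; /* multiplication constant */
    const int r = 24;               /* shift amount */

    /* h is the running hash; it starts from the seed and the length. */
    uint32_t h = seed ^ (uint32_t)len;

    /* Consume the input four bytes at a time. */
    while (len >= 4) {
        /* k is the current 4-byte block, read as little-endian. */
        uint32_t k = (uint32_t)data[0]
                   | ((uint32_t)data[1] << 8)
                   | ((uint32_t)data[2] << 16)
                   | ((uint32_t)data[3] << 24);

        k *= m;      /* scramble the block */
        k ^= k >> r; /* fold the high bits down into the low bits */
        k *= m;      /* scramble again */

        h *= m;      /* stir the hash accumulated so far */
        h ^= k;      /* merge the scrambled block into it */

        data += 4;   /* advance to the next block */
        len -= 4;    /* four fewer bytes remain */
    }

    /* Mix in the last 1 to 3 bytes, if any. */
    switch (len) {
    case 3: h ^= (uint32_t)data[2] << 16; /* fall through */
    case 2: h ^= (uint32_t)data[1] << 8;  /* fall through */
    case 1: h ^= data[0];
            h *= m;
    }

    /* Final mixing so the last bytes affect every output bit. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}
```

All arithmetic is on unsigned 32-bit integers, so the multiplications simply wrap around on overflow; that wrap-around is part of the mixing, not a bug.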
Hope that gives you enough to work through the rest.
Regards