Boost currently has a hash_combine function that outputs a 32-bit unsigned integer (to be precise, a size_t). Some references:
http://www.boost.org/doc/libs/1_43_0/doc/html/hash/reference.html#boost.hash_combine
http://www.boost.org/doc/libs/1_43_0/doc/html/hash/combine.html
Magic number in boost::hash_combine
I'd like to explore how to create a 64-bit version of hash_combine.
The first step is to get the golden ratio, or any other irrational number, as a 64-bit constant.
The second step is to use shifts. This part is rather tricky, so I'd like to ask whether there are best practices or guides on using shifts to compute hash values. Or is the choice of shifts in the original code:
seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
totally arbitrary?
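For illustration, here is a minimal sketch of the naive approach: widen the constant to 64 bits and keep the original shifts. The constant 0x9e3779b97f4a7c15 is floor(2^64 / phi), whose top 32 bits are the familiar 0x9e3779b9. The name hash_combine64_naive and the decision to reuse the shifts 6 and 2 unchanged are assumptions made for this sketch, not something Boost provides.

```cpp
#include <cstdint>

// floor(2^64 / phi), where phi = (1 + sqrt(5)) / 2 is the golden ratio.
// Its top 32 bits equal 0x9e3779b9, the constant in the 32-bit formula.
constexpr std::uint64_t kGolden64 = 0x9e3779b97f4a7c15ULL;

// Hypothetical 64-bit analogue of the Boost formula: same shape, wider constant.
// The shifts 6 and 2 are carried over as-is; they are not tuned for 64 bits.
inline std::uint64_t hash_combine64_naive(std::uint64_t seed, std::uint64_t h) {
    return seed ^ (h + kGolden64 + (seed << 6) + (seed >> 2));
}
```

Whether these shifts still mix well at 64 bits is exactly the open question, so this should be read as a starting point to test rather than a recommendation.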
Also, how can I evaluate the output of hash_combine to make sure that it doesn't create more collisions than the original hash function hash_value?
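One rough way to check this, as a sketch: feed a grid of small input pairs through the combiner and count repeated outputs. The helper names count_collisions and xor_combine are hypothetical, and a small structured grid is only a smoke test, not a substitute for a full hash test suite.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Runs every pair (a, b) with 0 <= a, b < n through `combine` and counts
// how many outputs repeat a value that was already produced.
template <class Combine>
std::size_t count_collisions(Combine combine, std::uint64_t n) {
    std::unordered_set<std::uint64_t> seen;
    std::size_t collisions = 0;
    for (std::uint64_t a = 0; a < n; ++a) {
        for (std::uint64_t b = 0; b < n; ++b) {
            // insert() returns {iterator, inserted}; inserted == false means a repeat.
            if (!seen.insert(combine(a, b)).second) {
                ++collisions;
            }
        }
    }
    return collisions;
}

// Deliberately weak baseline: plain XOR is symmetric and maps every (a, a) to 0.
inline std::uint64_t xor_combine(std::uint64_t a, std::uint64_t b) {
    return a ^ b;
}
```

For example, count_collisions(xor_combine, 16) reports 240, because the 256 pairs map onto only 16 distinct values. A candidate combiner can be passed in the same way and compared against that baseline.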
Read http://burtleburtle.net/bob/hash/doobs.html for some basic information on hash function design, and the rest of the articles at http://burtleburtle.net/bob/hash/ for more detailed information. CityHash was tested using http://code.google.com/p/smhasher/, and you can probably test your hash_combine using the same test suite.
Although I'm not an expert in hashing, the designs of recent hash functions lead me to believe that the two-shift technique Boost's hash_combine() uses is no longer state of the art and can be improved on.
If you only want a hash_combine that hashes two 64-bit values into one, and you don't need a new hash function for strings, you could just lift a tiny bit of code from CityHash, something like this (assuming size_t is a 64-bit unsigned integer; add your favorite bit of preprocessor or template trickery to validate that):
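A sketch of what that could look like, modeled on the Hash128to64 mixing step in CityHash as best I recall it. The multiplier kMul and the 47-bit shifts come from that function and should be verified against the CityHash sources. Using std::hash<T> to hash the value, and a static_assert as the size check, are assumptions made here; boost::hash or hash_value could be substituted.

```cpp
#include <cstddef>
#include <functional>

// Sketch only: requires a 64-bit std::size_t.
static_assert(sizeof(std::size_t) == 8, "this sketch assumes a 64-bit size_t");

template <class T>
inline void hash_combine(std::size_t& seed, const T& v) {
    // Multiplier used by CityHash's Hash128to64 (Murmur-inspired mixing).
    const std::size_t kMul = 0x9ddfea08eb382d69ULL;
    std::hash<T> hasher;  // assumption: std::hash stands in for hash_value
    std::size_t a = (hasher(v) ^ seed) * kMul;
    a ^= (a >> 47);
    std::size_t b = (seed ^ a) * kMul;
    b ^= (b >> 47);
    seed = b * kMul;
}
```

For a fixed seed, every step here (XOR with a fixed value, multiplication by an odd constant, and x ^= x >> 47) is invertible on 64-bit values, so two different hash values combined with the same seed cannot produce the same result. That says nothing about collisions across different seeds, which is where a test suite helps.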
(I think reproducing this snippet here and elsewhere is OK because it doesn't constitute a 'substantial portion' of the CityHash code, but please check the CityHash sources & license agreement to decide for yourself)