I wanted to port a state-of-the-art hash function MeiYan from C to Go. (As far as I know this is one of the best if not just the best hash function for hash tables in terms of speed and collision rate, it beats MurMur at least.)
I am new to Go, just spent one weekend with it, and came up with this version:
func meiyan(key *byte, count int) uint32 {
type P *uint32;
var h uint32 = 0x811c9dc5;
for ;count >= 8; {
a := ((*(*uint32)(unsafe.Pointer(key))) << 5)
b := ((*(*uint32)(unsafe.Pointer(key))) >> 27)
c := *(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(key)) + 4))
h = (h ^ ((a | b) ^ c)) * 0xad3e7
count -= 8
key = (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(key)) + 8))
}
if (count & 4) != 0 {
h = (h ^ uint32(*(*uint16)(unsafe.Pointer(key)))) * 0xad3e7
key = (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(key)) + 2))
h = (h ^ uint32(*(*uint16)(unsafe.Pointer(key)))) * 0xad3e7
key = (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(key)) + 2))
}
if (count & 2) != 0 {
h = (h ^ uint32(*(*uint16)(unsafe.Pointer(key)))) * 0xad3e7
key = (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(key)) + 2))
}
if (count & 1) != 0 {
h = (h ^ uint32(*key));
h = h * 0xad3e7
}
return h ^ (h >> 16);
}
Looks messy, but I do not think I can make it look better. Now I measure the speed and it is frustratingly slow, 3 times slower than C/C++ when compiled with gccgo -O3
. Can this be made faster? Is this just as good as compiler can make it or unsafe.Pointer
conversion is just as slow as it gets? In fact this surprised me, because I have seen that some other number crunching style code was just as fast as C or even faster. Am I doing something inneficiently here?
Here is the original C code I am porting from:
u32 meiyan(const char *key, int count) {
typedef u32* P;
u32 h = 0x811c9dc5;
while (count >= 8) {
h = (h ^ ((((*(P)key) << 5) | ((*(P)key) >> 27)) ^ *(P)(key + 4))) * 0xad3e7;
count -= 8;
key += 8;
}
#define tmp h = (h ^ *(u16*)key) * 0xad3e7; key += 2;
if (count & 4) { tmp tmp }
if (count & 2) { tmp }
if (count & 1) { h = (h ^ *key) * 0xad3e7; }
#undef tmp
return h ^ (h >> 16);
}
Here is how I measure speed:
func main(){
T := time.Now().UnixNano()/1e6
buf := []byte("Hello World!")
var controlSum uint64 = 0
for x := 123; x < 1e8; x++ {
controlSum += uint64(meiyan(&buf[0], 12))
}
fmt.Println(time.Now().UnixNano()/1e6 - T, "ms")
fmt.Println("controlSum:", controlSum)
}