I was looking through some of the .NET source yesterday and saw several implementations of GetHashCode along these lines:
(i1 << 5) + i1 ^ i2
I understand what the code is doing and why. What I want to know is why they used (i1 << 5) + i1 instead of (i1 << 5) - i1.
Most frameworks I've seen use -i1, because (i1 << 5) - i1 is equivalent to multiplying by 31, which is prime. The Microsoft way is equivalent to multiplying by 33, which factors as 3 × 11 and thus isn't prime.
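For reference, the equivalence between the shift-and-add forms and plain multiplication is easy to check. A minimal sketch (in Java here, but the identity holds in C# as well, overflow included):

```java
public class ShiftIdentity {
    public static void main(String[] args) {
        int h = 7; // any value works; wraparound behaves identically on both sides
        System.out.println((h << 5) + h == 33 * h); // (h * 32) + h = h * 33
        System.out.println((h << 5) - h == 31 * h); // (h * 32) - h = h * 31
    }
}
```

Both lines print true for every int, which is why the shift form was historically preferred over an explicit multiply.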
Is there a known justification for this? Any reasonable hypotheses?
I don't remember if 31 is one of them, but there are certain primes which get used as capacities by Dictionary<K,V>. And if you use one of those primes as the multiplier, the left field doesn't influence the chosen bucket anymore and the hash degenerates.

I asked the same question on math.stackexchange.com: Curious Properties of 33.
The conjecture among mathematicians, and the research I did on the topic, lead me to believe the answer is this: Dan Bernstein, the guy who came up with the constant 33, was never able to explain what property of 33 produced such a good distribution of hashes. Basically, in entropy and speed comparisons, the Bernstein hash does well enough and is quite snappy.
Several papers comparing hash functions have corroborated this finding without further explaining the benefit of using 33. Further, I couldn't find out why Java uses 31 instead. It appears to be a mathematical and programming mystery to date.
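For context, Bernstein's constant appears in his well-known string hash (commonly called djb2), while Java's String.hashCode uses the 31-based polynomial. Rough Java sketches of both, side by side:

```java
public class HashSketch {
    // Bernstein's djb2: h = h * 33 + c, seeded with 5381
    static int djb2(String s) {
        int h = 5381;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) + h + s.charAt(i); // h * 33 + c
        }
        return h;
    }

    // The 31-based variant, as specified for java.lang.String.hashCode
    static int hash31(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(djb2("abc"));
        System.out.println(hash31("abc") == "abc".hashCode()); // true
    }
}
```

The two functions differ only in the seed and the multiplier, which is exactly why the 31-versus-33 question keeps coming up.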