I have a simple class:
    public class TileName {
        int Zoom, X, Y;

        public override bool Equals (object obj)
        {
            var o = obj as TileName;
            return (o != null) && (o.Zoom == Zoom) && (o.X == X) && (o.Y == Y);
        }

        public override int GetHashCode ()
        {
            return (Zoom + X + Y).GetHashCode();
        }
    }
I was curious if I would get a better distribution of hash codes if I instead did something like:
    public override int GetHashCode ()
    {
        return Zoom.GetHashCode() + X.GetHashCode() + Y.GetHashCode();
    }
This class is going to be used as a Dictionary key, so I do want to make sure there is a decent distribution.
Neither of the implementations in your question is ideal. For example, they'll return exactly the same hash for { Zoom=1, X=2, Y=3 }, { Zoom=2, X=3, Y=1 }, { Zoom=3, X=1, Y=2 }, etc.
I usually use something like this:
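A minimal sketch of one such order-sensitive hash, assuming the common multiply-and-add pattern. The seed 23 and the multiplier 37 are illustrative choices rather than values taken from this answer, and the constructor and the `Demo` class are hypothetical additions so the example can build and print instances.

```csharp
using System;

public class TileName
{
    private readonly int Zoom, X, Y;

    // Constructor added for illustration only.
    public TileName(int zoom, int x, int y) { Zoom = zoom; X = x; Y = y; }

    public override bool Equals(object obj)
    {
        var o = obj as TileName;
        return (o != null) && (o.Zoom == Zoom) && (o.X == X) && (o.Y == Y);
    }

    public override int GetHashCode()
    {
        // unchecked: let the arithmetic wrap around instead of throwing on overflow.
        unchecked
        {
            int hash = 23;            // arbitrary seed (assumption)
            hash = hash * 37 + Zoom;  // 37 is an arbitrary odd prime (assumption)
            hash = hash * 37 + X;
            hash = hash * 37 + Y;
            return hash;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // The three permutations that collide under a plain sum
        // produce three distinct values here.
        Console.WriteLine(new TileName(1, 2, 3).GetHashCode());
        Console.WriteLine(new TileName(2, 3, 1).GetHashCode());
        Console.WriteLine(new TileName(3, 1, 2).GetHashCode());
    }
}
```

Because each step multiplies the running value before adding the next field, the position of a field affects the result, which is what separates the permutations.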
(From memory, I think the C# compiler uses something similar when it generates the GetHashCode methods for anonymous types.)
I've actually found this to be really effective.
As described by Jon Skeet in this SO answer, it is best practice to pick some prime numbers, multiply them with the individual hash codes, and then sum everything up.
The problems with xor hashes are:
- If X is equal to Y, then your hash will be just Zoom, because then X ^ Y = X ^ X = 0 holds.
- xor is a symmetric operator, so it will produce the exact same hashes for the objects [Zoom = 3, X = 5, Y = 7], [Zoom = 3, X = 7, Y = 5], [Zoom = 7, X = 5, Y = 3], etc.
These facts make the xor method more likely to cause collisions.
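Both effects are easy to reproduce. The sketch below uses a hypothetical helper, `XorHash`, that stands in for an xor-based GetHashCode over the three ints; the name and the `XorDemo` class are made up for this example.

```csharp
using System;

public static class XorDemo
{
    // Hypothetical helper mirroring an xor-based GetHashCode for three ints.
    public static int XorHash(int zoom, int x, int y)
    {
        return zoom ^ x ^ y;
    }

    public static void Main()
    {
        // X == Y: the two equal values cancel out, leaving only Zoom.
        Console.WriteLine(XorHash(3, 5, 5));  // prints 3
        Console.WriteLine(XorHash(3, 9, 9));  // prints 3

        // Permutations of the same three values always collide.
        Console.WriteLine(XorHash(3, 5, 7) == XorHash(3, 7, 5));  // prints True
        Console.WriteLine(XorHash(3, 5, 7) == XorHash(7, 5, 3));  // prints True
    }
}
```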
In addition to Jon's post, consider using an unchecked context to explicitly ignore overflows. As MSDN explains, when neither checked nor unchecked is specified, overflow checking of non-constant expressions depends on factors such as compiler options and environment configuration. So while overflows will usually be unchecked, the code may fail in some environment or when built with some compiler option. In this case, you explicitly do not want these overflows checked.
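To make the difference concrete, here is a small sketch that runs the same combining step in both contexts. The helper names and the multiplier 397 are assumptions chosen for illustration.

```csharp
using System;

public static class OverflowDemo
{
    // Wraps around silently on overflow, regardless of compiler settings.
    public static int CombineUnchecked(int a, int b)
    {
        unchecked { return a * 397 + b; }
    }

    // Throws OverflowException on overflow, regardless of compiler settings.
    public static int CombineChecked(int a, int b)
    {
        checked { return a * 397 + b; }
    }

    public static void Main()
    {
        int big = int.MaxValue;

        // The multiplication overflows and wraps; a hash only needs some int.
        Console.WriteLine(CombineUnchecked(big, 1));

        try
        {
            CombineChecked(big, 1);
        }
        catch (OverflowException)
        {
            Console.WriteLine("checked context threw OverflowException");
        }
    }
}
```

Writing unchecked inside GetHashCode pins the wrapping behaviour, so the method behaves the same whether or not the project is compiled with overflow checking enabled.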
Update:
By the way: someInt.GetHashCode() returns someInt. This is of course the fastest possible implementation and a perfect hash distribution without a single collision. How else would you map an int to an int hash? :) So what I wanted to say: your first approach, (Zoom + X + Y).GetHashCode(), and your second one, Zoom.GetHashCode() + X.GetHashCode() + Y.GetHashCode(), are exactly the same. You don't even have to call GetHashCode, and both are very likely to have collisions, maybe even more than the xor method if all three ints are likely to hold small values.
Update 2:
As I wrote in the comment to ChaosPandion's post: if you just have those three int values, and X, Y and Zoom are relatively small numbers (smaller than 1000 or 10000), this one may also be a good hash generator:
It just distributes the bits in the hash value (example in big-endian for readability):
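A minimal sketch of such a bit-distributing hash, assuming a shift-and-xor layout. The shift amounts 16 and 8 and the helper name `ShiftHash` are illustrative assumptions, not values confirmed by the post. With these shifts the fields occupy separate bit ranges only while Y and Zoom stay below 256; larger values overlap and are merged by the xor, which still yields a usable hash but with less separation.

```csharp
using System;

public static class ShiftHashDemo
{
    // Hypothetical layout: X in the upper bits, Y in the middle, Zoom in the low bits.
    // The shift amounts 16 and 8 are assumptions for illustration.
    public static int ShiftHash(int zoom, int x, int y)
    {
        return (x << 16) ^ (y << 8) ^ zoom;
    }

    public static void Main()
    {
        int hash = ShiftHash(2, 3, 5);
        Console.WriteLine(hash);  // prints 197890

        // Show all 32 bits, most significant first.
        // prints 00000000000000110000010100000010
        Console.WriteLine(Convert.ToString(hash, 2).PadLeft(32, '0'));
    }
}
```

Unlike the plain sum or xor, swapping two fields changes the result here, because each field lands in a different bit range.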