Given that collections like System.Collections.Generic.HashSet<> accept null as a set member, one can ask what the hash code of null should be. It looks like the framework uses 0:
// nullable struct type
int? i = null;
i.GetHashCode(); // gives 0
EqualityComparer<int?>.Default.GetHashCode(i); // gives 0
// class type
CultureInfo c = null;
EqualityComparer<CultureInfo>.Default.GetHashCode(c); // gives 0
This can be (a little) problematic with nullable enums. If we define
enum Season
{
    Spring,
    Summer,
    Autumn,
    Winter,
}
then the Nullable<Season> (also called Season?) can take just five values, but two of them, namely null and Season.Spring, have the same hash code.
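The collision is easy to check directly. The following is only a small sketch: the class name NullableEnumHashDemo and the helper DefaultHash are made up for illustration, and it simply asks the default comparer for the hash of each possible Season? value.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

static class NullableEnumHashDemo
{
    // Hypothetical helper: the hash code the default comparer assigns to a Season? value.
    public static int DefaultHash(Season? value) =>
        EqualityComparer<Season?>.Default.GetHashCode(value);

    static void Main()
    {
        // Print the hash for null first, then for every enum member.
        Console.WriteLine($"null: {DefaultHash(null)}");
        foreach (Season s in Enum.GetValues(typeof(Season)))
        {
            Console.WriteLine($"{s}: {DefaultHash(s)}");
        }
    }
}
```

If the observation above holds, the lines for null and Spring show the same number.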
It is tempting to write a "better" equality comparer like this:
class NewNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
{
    public override bool Equals(T? x, T? y)
    {
        return Default.Equals(x, y);
    }

    public override int GetHashCode(T? x)
    {
        return x.HasValue ? Default.GetHashCode(x) : -1;
    }
}
But is there any reason why the hash code of null should be 0?
EDIT/ADDITION:
Some people seem to think this is about overriding Object.GetHashCode(). It really is not, actually. (The authors of .NET did make an override of GetHashCode() in the Nullable<> struct which is relevant, though.) A user-written implementation of the parameterless GetHashCode() can never handle the situation where the object whose hash code we seek is null.
This is about implementing the abstract method EqualityComparer<T>.GetHashCode(T) or otherwise implementing the interface method IEqualityComparer<T>.GetHashCode(T). Now, while creating these links to MSDN, I see that it says there that these methods throw an ArgumentNullException if their sole argument is null. This must certainly be a mistake on MSDN? None of .NET's own implementations throw exceptions. Throwing in that case would effectively break any attempt to add null to a HashSet<>. Unless HashSet<> does something extraordinary when dealing with a null item (I will have to test that).
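One way to probe this is sketched below. The class NullHashProbe and its two helpers are hypothetical names introduced only for this test. As far as I know, EqualityComparer<string>.Default tolerates a null argument while StringComparer.Ordinal throws ArgumentNullException, so it is worth trying both and then checking whether a HashSet<string> built on each comparer still accepts null.

```csharp
using System;
using System.Collections.Generic;

static class NullHashProbe
{
    // Returns true if the comparer throws ArgumentNullException when asked to hash null.
    public static bool ThrowsOnNull(IEqualityComparer<string> comparer)
    {
        try
        {
            comparer.GetHashCode(null);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }

    // Returns true if a HashSet<string> using the comparer can add null and find it again.
    public static bool SetAcceptsNull(IEqualityComparer<string> comparer)
    {
        var set = new HashSet<string>(comparer);
        return set.Add(null) && set.Contains(null);
    }

    static void Main()
    {
        Console.WriteLine($"Default throws on null: {ThrowsOnNull(EqualityComparer<string>.Default)}");
        Console.WriteLine($"Ordinal throws on null: {ThrowsOnNull(StringComparer.Ordinal)}");
        Console.WriteLine($"HashSet with Ordinal accepts null: {SetAcceptsNull(StringComparer.Ordinal)}");
    }
}
```

If a comparer throws on null and the set still accepts null, then the set cannot be passing null on to the comparer.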
NEW EDIT/ADDITION:
Now I tried debugging. With HashSet<>, I can confirm that with the default equality comparer, the values Season.Spring and null will end up in the same bucket. This can be determined by very carefully inspecting the private array members m_buckets and m_slots. Note that the indices are always, by design, offset by one.
The code I gave above does not, however, fix this. As it turns out, HashSet<> will never even ask the equality comparer when the value is null. This is from the source code of HashSet<>:
// Workaround Comparers that throw ArgumentNullException for GetHashCode(null).
private int InternalGetHashCode(T item) {
    if (item == null) {
        return 0;
    }
    return m_comparer.GetHashCode(item) & Lower31BitMask;
}
This means that, at least for HashSet<>, it is not even possible to change the hash of null. Instead, a solution is to change the hash of all the other values, like this:
class NewerNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
{
    public override bool Equals(T? x, T? y)
    {
        return Default.Equals(x, y);
    }

    public override int GetHashCode(T? x)
    {
        return x.HasValue ? 1 + Default.GetHashCode(x) : /* not seen by HashSet: */ 0;
    }
}
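To see that HashSet<> really bypasses the comparer for null, one can count the calls. This is just a sketch: CountingComparer and HashSetNullDemo are hypothetical helpers, not framework types, and the comparer uses the same "shift everything by one" idea as above.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// Hypothetical helper: counts how often it is asked to hash null and non-null values.
sealed class CountingComparer : IEqualityComparer<Season?>
{
    public int NullHashCalls { get; private set; }
    public int NonNullHashCalls { get; private set; }

    public bool Equals(Season? x, Season? y) =>
        EqualityComparer<Season?>.Default.Equals(x, y);

    public int GetHashCode(Season? x)
    {
        if (x.HasValue)
        {
            NonNullHashCalls++;
            return 1 + x.Value.GetHashCode(); // shifted so Spring no longer hashes to 0
        }
        NullHashCalls++; // expected to stay at 0 when used by HashSet<>
        return 0;
    }
}

static class HashSetNullDemo
{
    // Adds null and Season.Spring to a set, looks both up, and returns the comparer.
    public static CountingComparer Run()
    {
        var comparer = new CountingComparer();
        var set = new HashSet<Season?>(comparer) { null, Season.Spring };
        set.Contains(null);
        set.Contains(Season.Spring);
        return comparer;
    }

    static void Main()
    {
        CountingComparer c = Run();
        Console.WriteLine($"hash requests for null: {c.NullHashCalls}");
        Console.WriteLine($"hash requests for non-null values: {c.NonNullHashCalls}");
    }
}
```

If the quoted source applies to the runtime in use, the first counter stays at zero while the second one grows.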
Good question.
I just tried to code this:
and execute it like this:
it returns
null
If I instead do a normal
it returns
0
, as expected, or simply Spring if we avoid casting to int.
So, if you do the following:
EDIT
From MSDN
If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values
In other words: if two objects have the same hash code, that doesn't mean that they are equal, because real equality is determined by Equals.
From MSDN again:
It is 0 for the sake of simplicity. There is no hard requirement that it be 0. You only need to satisfy the general requirements of hash coding.
For example, you need to make sure that if two objects are equal, their hash codes are always equal too. Therefore, different hash codes must always represent different objects. The converse does not hold: two different objects may have the same hash code, although if this happens often, the hash function is of poor quality because it has weak collision resistance.
Of course, I restricted my answer to requirements of mathematical nature. There are .NET-specific, technical conditions as well, which you can read here. 0 for a null value is not among them.
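To illustrate that collisions affect speed and not correctness, here is a minimal sketch with a deliberately terrible comparer. ConstantHashComparer and ConstantHashDemo are made-up names for this example, and the constant 42 is arbitrary. The comparer satisfies the rule that equal objects have equal hash codes, so a HashSet<> built on it still behaves correctly.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// Hypothetical comparer: every value gets the same hash code.
// Equal values trivially have equal hashes, so the contract is met,
// but every element collides with every other element.
sealed class ConstantHashComparer : IEqualityComparer<Season?>
{
    public bool Equals(Season? x, Season? y) => x == y; // lifted equality; null equals null

    public int GetHashCode(Season? x) => 42;
}

static class ConstantHashDemo
{
    // Builds a set from all five possible Season? values plus one duplicate.
    public static HashSet<Season?> Build()
    {
        return new HashSet<Season?>(new ConstantHashComparer())
        {
            null, Season.Spring, Season.Summer, Season.Autumn, Season.Winter,
            Season.Spring, // duplicate, rejected by Equals despite the shared hash
        };
    }

    static void Main()
    {
        HashSet<Season?> set = Build();
        Console.WriteLine($"count: {set.Count}");
        Console.WriteLine($"contains null: {set.Contains(null)}");
        Console.WriteLine($"contains Winter: {set.Contains(Season.Winter)}");
    }
}
```

The set still tells all five values apart because Equals makes the final decision; the shared hash code only means that every lookup has to walk through the colliding entries.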