We ran into a magic decimal number that broke our hashtable. I boiled it down to the following minimal case:
decimal d0 = 295.50000000000000000000000000m;
decimal d1 = 295.5m;
Console.WriteLine("{0} == {1} : {2}", d0, d1, (d0 == d1));
Console.WriteLine("0x{0:X8} == 0x{1:X8} : {2}", d0.GetHashCode(), d1.GetHashCode()
, (d0.GetHashCode() == d1.GetHashCode()));
Giving the following output:
295.50000000000000000000000000 == 295.5 : True
0xBF8D880F == 0x40727800 : False
What is really peculiar: change, add or remove any of the digits in d0 and the problem goes away. Even adding or removing one of the trailing zeros! The sign doesn't seem to matter though.
Our fix is to divide the value to get rid of the trailing zeroes, like so:
decimal d0 = 295.50000000000000000000000000m / 1.000000000000000000000000000000000m;
But my question is, how is C# doing this wrong?
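For reference, here is a minimal sketch of how this bites a hashtable-style lookup. The behaviour assumes a runtime that exhibits the problem (we saw it on .NET 4), since Dictionary uses GetHashCode to pick the bucket before it ever calls Equals:

using System;
using System.Collections.Generic;

decimal d0 = 295.50000000000000000000000000m;
decimal d1 = 295.5m;

var map = new Dictionary<decimal, string> { { d0, "stored under d0" } };

// d0 == d1 is true, but on an affected runtime the hash codes differ,
// so the lookup probes the wrong bucket and misses the entry.
Console.WriteLine(map.ContainsKey(d1));   // expected True, observed False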
The documentation suggests that because GetHashCode() is unpredictable, you should create your own. It's considered unpredictable because each type has its own implementation and, since we don't know its internals, we should create our own according to how we evaluate uniqueness.

However, I think the answer is that GetHashCode() is not using the mathematical decimal value to create the hash code.

Mathematically we see 295.50000000 and 295.5 as the same number. When you look at the decimal objects in the IDE this appears true too. However, if you call ToString() on both decimals you will see that they are stored differently: you will still see 295.50000000... with all its trailing zeros. GetHashCode() is evidently not using the mathematical value of the decimal to create the hash code.

Your fix simply creates a new decimal without all the trailing zeros, which is why it works.
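You can see this directly, because decimal preserves its scale (number of decimal places) and ToString() reflects it:

decimal d0 = 295.50000000000000000000000000m;
decimal d1 = 295.5m;

// The two "equal" values print differently because their stored scales differ.
Console.WriteLine(d0.ToString());   // 295.50000000000000000000000000
Console.WriteLine(d1.ToString());   // 295.5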
Another bug (?) that results in different byte representations for the same decimal on different compilers: try to compile the following code on VS 2005 and then on VS 2010, or look at my article on Code Project.

Some people use the normalization code d = d + 0.0000m, which does not work properly on VS 2010. Your normalization code (d = d / 1.000000000000000000000000000000000m) looks good - I use the same one to get the same byte array for the same decimals.
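Here is a small sketch of that normalization as a helper method (the method name is mine; the trick itself is the division from the question):

// Dividing by 1.000...m forces the runtime to rescale the result,
// dropping the extra trailing zeros from the stored representation.
static decimal Normalize(decimal d)
{
    return d / 1.000000000000000000000000000000000m;
}

// Usage: Normalize(295.50000000000000000000000000m) and 295.5m should then
// have the same bytes, and therefore the same GetHashCode().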
To start with, C# isn't doing anything wrong at all. This is a framework bug.
It does indeed look like a bug though - basically whatever normalization is involved in comparing for equality ought to be used in the same way for hash code computation. I've checked and can reproduce it too (using .NET 4), including checking the Equals(decimal) and Equals(object) methods as well as the == operator.

It definitely looks like it's the d0 value which is the problem, as adding trailing 0s to d1 doesn't change the results (until it's the same as d0, of course). I suspect there's some corner case tripped by the exact bit representation there.

It's surprising (and as you say, it works most of the time), but you should report the bug on Connect.
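A sketch of those checks, so they are easy to re-run on other framework versions (results as reported for .NET 4):

decimal d0 = 295.50000000000000000000000000m;
decimal d1 = 295.5m;

// Every equality path agrees the values are equal...
Console.WriteLine(d0 == d1);                 // True
Console.WriteLine(d0.Equals(d1));            // True  (Equals(decimal))
Console.WriteLine(d0.Equals((object)d1));    // True  (Equals(object))

// ...yet the hash codes disagree on an affected runtime.
Console.WriteLine(d0.GetHashCode() == d1.GetHashCode());   // False on .NET 4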
This is a decimal rounding error.

Too much precision is required to represent d0 with all of those trailing zeros; as a consequence, the algorithm responsible makes a mistake and ends up giving a different result. It could be classified as a bug in this example, although note that the decimal type is supposed to have a precision of 28 digits, and here you are actually requiring a precision of 29 digits for d0.
This can be tested by asking for the full raw hexadecimal representation of d0 and d1.
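For instance, something like this dumps the raw representation (decimal.GetBits returns four ints: the low, middle and high 32 bits of the 96-bit mantissa, then a flags word holding the scale and sign):

static void DumpRaw(decimal d)
{
    // decimal.GetBits returns [lo, mid, hi, flags];
    // the scale (number of decimal places) sits in bits 16-23 of flags.
    foreach (int part in decimal.GetBits(d))
        Console.Write("{0:X8} ", part);
    Console.WriteLine();
}

// DumpRaw(295.50000000000000000000000000m) and DumpRaw(295.5m) show
// different mantissas and scales for the same mathematical value.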
Ran into this bug too ... :-(
Tests (see below) indicate that this depends on the maximum precision available for the value. The wrong hash codes only occur near the maximum precision for the given value. As the tests show, the error seems to depend on the digits to the left of the decimal point. Sometimes only the hash code for maxDecimalDigits - 1 is wrong, sometimes the hash code for maxDecimalDigits is wrong.
I tested this in VB.NET (v3.5) and got the same thing.
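A minimal sketch of that kind of test (not the original test code); decimal.Parse keeps trailing zeros from the string, so each extra zero raises the stored scale by one:

using System;
using System.Globalization;

// Build 295.5 with 1..26 decimal places (26 is the most that still fits in
// the 96-bit mantissa for this value) and compare against plain 295.5m.
for (int extraZeros = 0; extraZeros <= 25; extraZeros++)
{
    decimal d = decimal.Parse("295.5" + new string('0', extraZeros),
                              CultureInfo.InvariantCulture);
    Console.WriteLine("scale {0,2}: hash 0x{1:X8}  equal: {2}",
                      extraZeros + 1, d.GetHashCode(), d == 295.5m);
}
// On an affected runtime only the highest-precision rows show a different hash.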
The interesting thing about the hash codes:
A) 0x40727800 = 1081243648
B) 0xBF8D880F = -1081243633 (almost exactly the negative of A)
Using Decimal.GetBits() I found
format (high word first): mantissa (hhhhhhhh hhhhhhhh hhhhhhhh), flags (s0ee0000) - 'h' is a mantissa hex digit, 's' holds the sign bit, 'ee' is the exponent (scale), '0' must be zero
d1 ==> 00000000 00000000 00000B8B - 00010000 = (2955 / 10 ^ 1) = 295.5
d0 ==> 5F7B2FE5 D8EACD6E 2E000000 - 001A0000
...which converts to 29550000000000000000000000000 / 10^26 = 295.5000000...etc
** edit: OK, I wrote a 128-bit hex/decimal calculator and the above is exactly correct.
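Rather than a hand-rolled 128-bit calculator, the same check can be done with BigInteger (a sketch; GetBits returns the words low-first, so they are reassembled here in the opposite order to the listing above):

using System;
using System.Numerics;

static void PrintExactValue(decimal d)
{
    int[] bits = decimal.GetBits(d);               // [lo, mid, hi, flags]
    BigInteger mantissa = ((BigInteger)(uint)bits[2] << 64)
                        | ((BigInteger)(uint)bits[1] << 32)
                        | (uint)bits[0];
    int scale = (bits[3] >> 16) & 0xFF;            // decimal places
    bool negative = bits[3] < 0;                   // sign is bit 31 of flags
    Console.WriteLine("{0}{1} / 10^{2}", negative ? "-" : "", mantissa, scale);
}

// PrintExactValue(295.50000000000000000000000000m)
//   -> 29550000000000000000000000000 / 10^26
// PrintExactValue(295.5m)
//   -> 2955 / 10^1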
It definitely looks like an internal conversion bug of some sort. Microsoft explicitly states that they do not guarantee their default implementation of GetHashCode. If you are using it for anything important then it probably makes sense to write your own GetHashCode for the decimal type. Formatting the value to a fixed-decimal, fixed-width string and hashing that seems to work, for example (>29 decimal places, >58 width - fits all possible decimals).
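A sketch of that workaround (the method name is mine): formatting every decimal to the same fixed number of decimal places makes mathematically equal values produce the same text, and therefore the same hash.

using System;
using System.Globalization;

static int StableDecimalHash(decimal d)
{
    // "F29" pads to 29 decimal places, which covers any scale a decimal can
    // carry, so 295.5m and 295.50000000000000000000000000m format identically.
    return d.ToString("F29", CultureInfo.InvariantCulture).GetHashCode();
}

Note this only needs to be stable within a process, which is all a hashtable requires; it is not a persistent hash.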
* edit: I don't know about this anymore. It still must be a conversion error somewhere, since the stored precision fundamentally changes the real value in memory. That the hash codes end up almost exactly as signed negatives of each other is a big clue - one would need to look further into the default hash code implementation to find more.
28 or 29 digits shouldn't matter unless there is dependent code which does not handle the outer extents properly. The largest representable 96-bit integer is:
79228162514264337593543950335
so you can have 29 digits so long as the whole number (without the decimal point) is less than this value. I can't help but think that the cause is something much more subtle somewhere in the hash code calculation.
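That limit is easy to double-check (a quick sketch):

using System;
using System.Numerics;

BigInteger max96 = (BigInteger.One << 96) - 1;
Console.WriteLine(max96);              // 79228162514264337593543950335
Console.WriteLine(decimal.MaxValue);   // same digits: full 96-bit mantissa, scale 0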