Why would Microsoft want NOT to fix the wrong impl

Posted 2019-03-18 05:31

In the .NET Framework, the implementation (override) of Equals(object) and GetHashCode() for the floating-point types (System.Double and System.Single) is wrong. To quote from the MSDN specification of Object.GetHashCode():

A hash function must have the following properties:

• If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values.

If you take two NaN values with different binary representations, the two objects do compare equal under the Equals method, but the hash codes are (almost always) distinct.

Now, this error has been reported on Microsoft Connect. But why will they not fix this?

The fix is easy: either make NaN values with different bit patterns compare as unequal, or return one fixed hash code for every NaN.
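The second option can be approximated today at the call site. The following is only a minimal sketch of a workaround, not what a framework fix would look like: the names NaNAwareDoubleComparer, ComparerDemo, MakeNaN and CountDistinct are hypothetical and do not exist in the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical workaround: keeps the Double.Equals semantics
// but gives every NaN the same hash code.
sealed class NaNAwareDoubleComparer : IEqualityComparer<double>
{
  public bool Equals(double x, double y)
  {
    return x.Equals(y);   // Double.Equals already treats any two NaN values as equal
  }

  public int GetHashCode(double value)
  {
    // Collapse every NaN bit pattern to the hash code of the canonical double.NaN.
    return double.IsNaN(value) ? double.NaN.GetHashCode() : value.GetHashCode();
  }
}

static class ComparerDemo
{
  // Builds a quiet NaN that carries the given payload in its low mantissa bits.
  public static double MakeNaN(long payload)
  {
    return BitConverter.Int64BitsToDouble(0x7FF8000000000000L | payload);
  }

  // Counts how many values a HashSet<double> keeps when it uses the comparer above.
  public static int CountDistinct(params double[] values)
  {
    return new HashSet<double>(values, new NaNAwareDoubleComparer()).Count;
  }
}
```

With this comparer, a set created as new HashSet<double>(new NaNAwareDoubleComparer()) should keep at most one NaN, because Equals and GetHashCode now agree with each other.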

The fix won't break anything: the way things are today, hash-based collections already misbehave as soon as NaN values with different bit patterns are used.

Can you think of any reason not to fix this?

Here's a simple example illustrating the current behavior:

using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
  const int setSize = 1000000; // change to higher value if you want to waste even more memory
  const double oneNaNToRuleThemAll = double.NaN;
  static readonly Random randomNumberGenerator = new Random();

  static void Main()
  {
    var set = new HashSet<double>();   // uses default EqualityComparer<double>

    while (set.Count < setSize)
      set.Add(GetSomeNaN());

    Console.WriteLine("We now have a set with {0:N0} members", set.Count);
    bool areAllEqualToTheSame = set.All(oneNaNToRuleThemAll.Equals);
    if (areAllEqualToTheSame)
      Console.WriteLine("By transitivity, all members of the set are (pairwise) equal.");
  }

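  // Returns a double with a random bit pattern whose exponent bits are all ones.
  // This is a NaN unless the 52 mantissa bits all come out as zero, in which case
  // it is PositiveInfinity or NegativeInfinity (unlikely).
  static double GetSomeNaN()
  {
    byte[] b = new byte[8];
    randomNumberGenerator.NextBytes(b);
    // Assumes little-endian byte order: b[7] holds the sign bit and the top seven exponent bits.
    b[7] |= 0x7F;
    // The high nibble of b[6] holds the remaining four exponent bits.
    b[6] |= 0xF0;
    return BitConverter.ToDouble(b, 0);
  }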
}

Result of running the code: One million duplicates in a HashSet<>.

PLEASE NOTE: This has nothing at all to do with the == and != operators of C#. Please use Equals if you want to check this for yourself.
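To make the distinction concrete, here is a small sketch (the class and method names are hypothetical) that compares the same two NaN values once with the == operator and once with Equals:

```csharp
using System;

static class OperatorVersusEquals
{
  // The == operator follows IEEE 754: NaN is never equal to anything, including NaN.
  public static bool OperatorSaysEqual()
  {
    double a = double.NaN;
    double b = double.NaN;
    return a == b;
  }

  // Double.Equals deliberately treats NaN as equal to NaN.
  public static bool EqualsSaysEqual()
  {
    double a = double.NaN;
    double b = double.NaN;
    return a.Equals(b);
  }
}
```

OperatorSaysEqual() returns false while EqualsSaysEqual() returns true. Only the Equals behavior matters for HashSet<> and Dictionary<,>, which is why the hash code contract is what is at stake here.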
