Implement “tolerant” `equals` & `hashCode` for a c

2020-06-03 03:37发布

问题:

I have a class with a float field. For example:

public class MultipleFields {
  final int   count;
  final float floatValue;

  public MultipleFields(int count, float floatValue) {
    this.count = count;
    this.floatValue = floatValue;
  }

}

I need to be able to compare instances by value. Now how do I properly implement equals & hashCode?

The usual way to implement equals and hashCode is to just consider all fields. E.g. Eclipse will generate the following equals:

  public boolean equals(Object obj) {
    // irrelevant type checks removed
    ....
    MultipleFields other = (MultipleFields) obj;
    if (count != other.count)
      return false;
    if (Float.floatToIntBits(floatValue) != Float.floatToIntBits(other.floatValue))
      return false;
    return true;
  }

(and a similar hashCode, that essentially computes count* 31 + Float.floatToIntBits(floatValue)).

The problem with this is that my FP values are subject to rounding errors (they may come from user input, from a DB, etc.). So I need a "tolerant" comparison.

The common solution is to compare using an epsilon value (see e.g. Comparing IEEE floats and doubles for equality). However, I'm not quite certain how I can implement equals using this method, and still have a hashCode that is consisten with equals.

My idea is to define the number of significant digits for comparison, then always round to that number of digits in both equals and hashCode:

long comparisonFloatValue = Math.round(floatValue* (Math.pow(10, RELEVANT_DIGITS)));

Then if I replace all uses of floatValue with comparisonFloatValue in equals and hashCode, I should get a "tolerant" comparison, which is consistent with hashCode.

  • Will this work?
  • Do you see any problems with this approach?
  • Is there a better way to do this? It seems rather complicated.

回答1:

The big problem with it is that two float values could still be very close together but still compare unequal. Basically you're dividing the range of floating point values into buckets - and two values could be very close together without being in the same bucket. Imagine you were using two significant digits, applying truncation to obtain the bucket, for example... then 11.999999 and 12.000001 would be unequal, but 12.000001 and 12.9999999 would be equal despite being much further apart from each other.

Unfortunately, if you don't bucket values like this, you can't implement equals appropriately because of transitivity: x and y may be close together, y and z may be close together, but that doesn't mean that x and z are close together.