Why do we check hash if we are going to check equa

2019-08-04 04:37发布

问题:

If two objects are equal then hashcode must be same. Then why does the any check in HashMap do -

 if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) {

Instead of simply

if ((k = e.key) == key || (key != null && key.equals(k)))) {

回答1:

If two objects are equal then hashcode must be same.

In this case, take it the other way: "If hashcodes of two objects are different, they can't be equal"

So, here we are simply short-circuiting the comparison using equals() by first comparing the hashes.

Since hash is of type int, comparing 2 ints is not an expensive operation (Just uses a single machine instruction - if_icmp<cond>.

On the other hand, equals() method for various objects might involve complex operations, of course making it an expensive operation in compared to int comparison. So, we just do hash comparison for earlier exit.



回答2:

Because the hash check is cheap, and the equals() method call might be expensive. If the hash check fails, we don't need to bother with the equals() check to return false, so we save time.



回答3:

On top of what @MarounMaroun said, another advantage of hash is that it returns an int. This lets you use it as an index into an array (which is how the implementation of a hash table works). equals returns a boolean, so it can't be used this way.



回答4:

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

However When it comes to Hashcode, We should be aware of

  • If two objects are equal, they will have same hashcodes

  • If the two objects have same hashcode , that doesn't mean that they
    must be equal

  • Also note that, if two objects are not equal, even then they can have the same hash code.

You Can understand with this example.

Lets Consider two Numbers 10,40 and Hashcode Logic is calculated using %(Mod) 6 so. ex 10 % 6 = 4 and 40 % 6 = 4

Although 10,40 are different but returning Same HashCode.

However if i change my Hashcode logic to use % (Mod) 37 then, ex 10 % 37 = 10 and 40 % 37 = 3

Now With this Hash logic. 10 & 40 are different!

Hence Hashcode basically depends on your logic to calculate Hash.



回答5:

In situations where the hash values of two objects to be compared will always be known, and objects which are not reference-identical will often have different hash values, comparing them will often be faster than comparing the objects themselves, sometimes considerably so, and will never be 'much' slower. For example, if one wishes to compare two 10,000-character strings which are identical except for the last few characters, and if one knows the hash values of the two strings, checking to see whether the hash values match will be cheaper than examining enough characters of each string to find the first difference.

If the hash values are not known, computing them will generally be slower than direct comparison of the objects. If they will sometimes be known, checking whether they are known and using them if so will sometimes be helpful and sometimes not, depending upon how frequently the objects are involved in comparisons with other things whose hash values happen to be different, and upon how long the comparisons in such cases would take. Note also that if collections of arbitrary-type objects will need to support direct comparison with other collections to see if they hold identical content, it may be advantageous for them to compute and cache the hash values of all the items they contain, as well a composite hash which combines the hashes of all the values. If one has two large collections of thousand-character strings which might differ except for the last few characters of the last string, being able to know that they're unequal without having to check almost every character of every string can be a big win.