If two objects are equal then hashcode must be same. Then why does the any check in HashMap
do -
if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) {
Instead of simply
if ((k = e.key) == key || (key != null && key.equals(k)))) {
On top of what @MarounMaroun said, another advantage of hash is that it returns an
int
. This lets you use it as an index into an array (which is how the implementation of a hash table works).equals
returns a boolean, so it can't be used this way.In this case, take it the other way: "If hashcodes of two objects are different, they can't be equal"
So, here we are simply short-circuiting the comparison using
equals()
by first comparing thehash
es.Since
hash
is of typeint
, comparing 2int
s is not an expensive operation (Just uses a single machine instruction -if_icmp<cond>
.On the other hand,
equals()
method for various objects might involve complex operations, of course making it an expensive operation in compared toint
comparison. So, we just dohash
comparison for earlier exit.If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
However When it comes to Hashcode, We should be aware of
If two objects are equal, they will have same hashcodes
If the two objects have same hashcode , that doesn't mean that they
must be equal
Also note that, if two objects are not equal, even then they can have the same hash code.
You Can understand with this example.
Although 10,40 are different but returning Same HashCode.
Now With this Hash logic. 10 & 40 are different!
Hence Hashcode basically depends on your logic to calculate Hash.
Because the hash check is cheap, and the
equals()
method call might be expensive. If the hash check fails, we don't need to bother with theequals()
check to return false, so we save time.In situations where the hash values of two objects to be compared will always be known, and objects which are not reference-identical will often have different hash values, comparing them will often be faster than comparing the objects themselves, sometimes considerably so, and will never be 'much' slower. For example, if one wishes to compare two 10,000-character strings which are identical except for the last few characters, and if one knows the hash values of the two strings, checking to see whether the hash values match will be cheaper than examining enough characters of each string to find the first difference.
If the hash values are not known, computing them will generally be slower than direct comparison of the objects. If they will sometimes be known, checking whether they are known and using them if so will sometimes be helpful and sometimes not, depending upon how frequently the objects are involved in comparisons with other things whose hash values happen to be different, and upon how long the comparisons in such cases would take. Note also that if collections of arbitrary-type objects will need to support direct comparison with other collections to see if they hold identical content, it may be advantageous for them to compute and cache the hash values of all the items they contain, as well a composite hash which combines the hashes of all the values. If one has two large collections of thousand-character strings which might differ except for the last few characters of the last string, being able to know that they're unequal without having to check almost every character of every string can be a big win.