Recently I read through this Developer Works Document.
The document is all about defining hashCode() and equals() effectively and correctly; however, I am not able to figure out why we need to override these two methods.
How do I decide when to implement these methods, and how to implement them efficiently?
Java imposes a rule: if two objects are equal according to equals(), they must return the same value from hashCode().
So if our class overrides equals(), it should also override hashCode() to follow this rule. Both methods, equals() and hashCode(), are used by Hashtable, for example, to store values as key-value pairs. If we override one and not the other, the Hashtable may not work as we want when such an object is used as a key.
OK, let me explain the concept in very simple words.
First, from a broader perspective: we have collections, and HashMap is one of the data structures among them.
To understand why we have to override both equals and hashCode, we first need to understand what a HashMap is and what it does.
A HashMap is a data structure that stores key-value pairs in an array. Let's say a[], where each element of 'a' is a key-value pair.
Each index in that array can also hold a linked list, so one index can hold more than one entry.
Now why is a HashMap used? If we have to search a large array, checking every element is not efficient. The hashing technique says: pre-process the array with some logic and group the elements based on that logic, i.e. hashing.
E.g. we have the array 1,2,3,4,5,6,7,8,9,10,11 and we apply the hash function mod 10, so 1 and 11 are grouped together. To search for 11 in the original array we would have to iterate over the whole array, but once the elements are grouped we only iterate over one group, which improves speed. For simplicity, the data structure that stores all this can be thought of as a 2D array.
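The mod 10 grouping described above can be sketched as follows. This is only an illustration of the bucketing idea, not how java.util.HashMap is actually written; the class and method names (BucketSketch, bucketIndex, group, contains) are made up for this example, and it assumes non-negative values.

```java
import java.util.ArrayList;
import java.util.List;

class BucketSketch {
    // The hash function from the text: value mod 10 (assumes value >= 0).
    static int bucketIndex(int value) {
        return value % 10;
    }

    // Pre-process the array: put every value into the bucket chosen by its hash.
    static List<List<Integer>> group(int[] values) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int v : values) {
            buckets.get(bucketIndex(v)).add(v);
        }
        return buckets;
    }

    // Search only the one bucket the value could be in, not the whole array.
    static boolean contains(List<List<Integer>> buckets, int value) {
        return buckets.get(bucketIndex(value)).contains(value);
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
        List<List<Integer>> buckets = group(values);
        System.out.println(buckets.get(1));        // [1, 11]
        System.out.println(contains(buckets, 11)); // true
    }
}
```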
Apart from that, a HashMap also promises not to store duplicate keys. This is the main reason why we have to override equals and hashCode.
So when someone asks you to explain the internal working of HashMap, we need to look at what methods it has and how they follow the rules I explained above.
HashMap has a method put(K, V), and it has to follow those rules: distribute entries efficiently across the array and not add duplicate keys.
What put does is first compute the hash code of the given key to decide which index the entry should go in. If nothing is present at that index, the new entry is added there. If something is already present, the new entry is appended to the end of the linked list at that index, unless an equal key is already there, in which case its value is replaced. Remember, no duplicate keys should be added. So let's say you have two objects of your own class, aa and bb, both holding the value 11 (Integer itself already overrides both methods, so it would not show the problem). As every class derives from Object, the default equals compares references, not the values inside the objects. So aa and bb, though semantically equal, fail the equality test, and their default hash codes will usually differ as well, so both get stored and the map ends up with duplicates. If we override both methods, we avoid adding duplicates. You could also refer to Detail working.
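To make the duplicate-key problem concrete, here is a minimal sketch. PlainKey and ValueKey are hypothetical classes invented for this example: the first inherits equals() and hashCode() from Object, the second overrides both based on its value.

```java
import java.util.HashMap;
import java.util.Map;

class DuplicateKeyDemo {
    // Two semantically equal keys are stored as two separate entries.
    static int sizeWithPlainKeys() {
        Map<PlainKey, String> map = new HashMap<>();
        map.put(new PlainKey(11), "first");
        map.put(new PlainKey(11), "second");
        return map.size();
    }

    // The second put finds the equal key and replaces its value.
    static int sizeWithValueKeys() {
        Map<ValueKey, String> map = new HashMap<>();
        map.put(new ValueKey(11), "first");
        map.put(new ValueKey(11), "second");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithPlainKeys()); // 2
        System.out.println(sizeWithValueKeys()); // 1
    }
}

// Inherits reference-based equals() and hashCode() from Object.
class PlainKey {
    final int value;

    PlainKey(int value) {
        this.value = value;
    }
}

// equals() and hashCode() are both derived from the same field.
class ValueKey {
    final int value;

    ValueKey(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ValueKey)) return false;
        return value == ((ValueKey) o).value;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(value);
    }
}
```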
Bah - "You must override hashCode() in every class that overrides equals()."
[from Effective Java, by Joshua Bloch?]
Isn't this the wrong way round? Overriding hashCode likely implies you're writing a hash-key class, but overriding equals certainly does not. There are many classes that are not used as hash keys, but do want a logical-equality-testing method for some other reason. If you choose "equals" for it, you may then be mandated to write a hashCode implementation by overzealous application of this rule. All that achieves is adding untested code to the codebase, an evil waiting to trip someone up in the future. Also, writing code you don't need is anti-agile. It's just wrong (and an IDE-generated one will probably be incompatible with your hand-crafted equals).
Surely they should have mandated an Interface on objects written to be used as keys? Regardless, Object should never have provided default hashCode() and equals() imho. It's probably encouraged many broken hash collections.
But anyway, I think the "rule" is written back to front. In the meantime, I'll keep avoiding using "equals" for equality testing methods :-(
Simply put, the equals method in Object checks for reference equality, whereas two instances of your class could still be semantically equal when their properties are equal. This is important, for instance, when putting your objects into a container that relies on equals and hashCode, like HashMap and Set. Let's say we have a class with an id field.
We create two instances with the same id.
Without overriding equals, comparing the two instances with equals returns false, because they are two different objects.
Correct? Well maybe, if this is what you want. But let's say we want objects with the same id to count as the same object, even if they are two different instances. Then we override equals (and hashCode).
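A minimal sketch of that walkthrough, assuming a hypothetical class with a non-null id and one other field. The names PlainFoo, Foo, id and label are invented for illustration: PlainFoo keeps the inherited behaviour, Foo overrides equals and hashCode using only the id.

```java
class FooDemo {
    public static void main(String[] args) {
        PlainFoo p1 = new PlainFoo("id-1", "something");
        PlainFoo p2 = new PlainFoo("id-1", "something else");
        System.out.println(p1.equals(p2)); // false: two different instances
        System.out.println(p1.equals(p1)); // true: same instance

        Foo a = new Foo("id-1", "something");
        Foo b = new Foo("id-1", "something else");
        System.out.println(a.equals(b));                  // true: same id
        System.out.println(a.hashCode() == b.hashCode()); // true: hash follows id
    }
}

// No overrides: equals() is Object's reference comparison.
class PlainFoo {
    final String id;
    final String label;

    PlainFoo(String id, String label) {
        this.id = id;
        this.label = label;
    }
}

// Equality is defined by id alone; label is deliberately ignored.
class Foo {
    final String id; // assumed non-null in this sketch
    final String label;

    Foo(String id, String label) {
        this.id = id;
        this.label = label;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Foo)) return false;
        return id.equals(((Foo) other).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}
```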
As for implementing equals and hashCode, I can recommend using Guava's helper methods.
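Guava's com.google.common.base.Objects offers equal(a, b) and hashCode(Object...) for this. To keep the sketch free of dependencies, the version below uses the JDK counterparts java.util.Objects.equals and java.util.Objects.hash instead (available since Java 7). The Person class and its fields are hypothetical.

```java
import java.util.Objects;

class PersonDemo {
    public static void main(String[] args) {
        Person a = new Person("Ann", 30);
        Person b = new Person("Ann", 30);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true

        // Null fields are handled by the helpers, no explicit null checks needed.
        System.out.println(new Person(null, 30).equals(new Person(null, 30))); // true
    }
}

class Person {
    private final String name; // may be null
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        // Objects.equals is null-safe: true if both are null or a.equals(b).
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Hash exactly the fields that equals() compares.
        return Objects.hash(name, age);
    }
}
```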
equals() and hashCode() methods in Java
They are methods of the java.lang.Object class, which is the superclass of all classes (custom classes as well as those defined in the Java API).
Implementation:
public boolean equals(Object obj)
In Object, this method simply checks whether two object references x and y refer to the same object, i.e. it checks if x == y. For non-null references, any implementation of equals has the following properties:
It is reflexive: for any reference value x, x.equals(x) should return true.
It is symmetric: for any reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.
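As an illustration of how an override can break one of these properties, here is a sketch of a symmetry violation. CaseInsensitiveName is a hypothetical class written for this example: its equals accepts plain Strings, but String.equals knows nothing about it.

```java
import java.util.Locale;

class SymmetryDemo {
    public static void main(String[] args) {
        CaseInsensitiveName x = new CaseInsensitiveName("Polish");
        String s = "polish";
        System.out.println(x.equals(s)); // true
        System.out.println(s.equals(x)); // false: symmetry is violated
    }
}

class CaseInsensitiveName {
    private final String name;

    CaseInsensitiveName(String name) {
        this.name = name;
    }

    // Broken on purpose: tries to interoperate with String,
    // which String.equals() cannot reciprocate.
    @Override
    public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveName) {
            return name.equalsIgnoreCase(((CaseInsensitiveName) o).name);
        }
        if (o instanceof String) {
            return name.equalsIgnoreCase((String) o);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return name.toLowerCase(Locale.ROOT).hashCode();
    }
}
```

Dropping the String branch from equals would restore symmetry in this sketch.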
public int hashCode()
This method returns the hash code value, as an integer, for the object on which it is invoked. It is supported for the benefit of hash-based collection classes such as Hashtable, HashMap and HashSet. This method must be overridden in every class that overrides the equals method.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
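The last two points of the contract can be checked with String, which overrides both methods. This is a small sketch with made-up helper names; the strings "Aa" and "BB" are a convenient pair of unequal values that happen to share a hash code.

```java
class HashContractDemo {
    // Equal objects must report the same hash code.
    static boolean equalImpliesSameHash(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("hello");
        String b = new String("hello");
        System.out.println(a == b);                      // false: different instances
        System.out.println(a.equals(b));                 // true
        System.out.println(equalImpliesSameHash(a, b));  // true

        // Unequal objects are allowed to collide; it only costs performance.
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```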