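import java.util.HashSet;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        // Add four StringBuffers with identical contents.
        Set<StringBuffer> set = new HashSet<>();
        set.add(new StringBuffer("abc"));
        set.add(new StringBuffer("abc"));
        set.add(new StringBuffer("abc"));
        set.add(new StringBuffer("abc"));
        System.out.println(set);
    }
}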
Output:
[abc, abc, abc, abc]
In the code above I added new StringBuffer("abc") objects several times and the Set stored every one of them, but a Set is never supposed to contain duplicates.
A hash set works with "buckets". It stores values in those buckets according to their hash code. A bucket can hold several members; within a bucket, members are told apart using the equals(Object) method.
So let's say we construct a hash set with 10 buckets, for argument's sake, and add the integers 1, 2, 3, 5, 7, 11 and 13 to it. The hash code for an int is just the int. We end up with something like this:
bucket 1: 1, 11
bucket 2: 2
bucket 3: 3, 13
bucket 5: 5
bucket 7: 7
(buckets 0, 4, 6, 8 and 9 are empty)
The traditional way to use a set is to check whether a member is in that set. So when we ask, "Is 11 in this set?" the hash set will take 11 modulo 10, get 1, and look in bucket 1, which is the 2nd bucket (we're numbering our buckets from 0, of course).
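The bucket lookup described above can be sketched as a toy class. This is only an illustration under the same assumption of 10 fixed buckets; the name BucketSketch and its methods are made up for this example, and java.util.HashSet uses a different table size and hashing scheme internally.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: a fixed list of 10 buckets, as in the explanation above.
class BucketSketch {
    static final int BUCKET_COUNT = 10; // assumed "for argument's sake"

    final List<List<Object>> buckets = new ArrayList<>();

    BucketSketch() {
        for (int i = 0; i < BUCKET_COUNT; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Pick the bucket from the hash code; floorMod keeps the index non-negative.
    static int bucketIndex(Object value) {
        return Math.floorMod(value.hashCode(), BUCKET_COUNT);
    }

    // Only one bucket is searched, and equals(Object) decides membership.
    boolean contains(Object value) {
        return buckets.get(bucketIndex(value)).contains(value);
    }

    // Returns false, and stores nothing, if an equal member is already present.
    boolean add(Object value) {
        if (contains(value)) {
            return false;
        }
        buckets.get(bucketIndex(value)).add(value);
        return true;
    }

    public static void main(String[] args) {
        BucketSketch set = new BucketSketch();
        for (int n : new int[] {1, 2, 3, 5, 7, 11, 13}) {
            set.add(n);
        }
        System.out.println(bucketIndex(11));    // 1
        System.out.println(set.buckets.get(1)); // [1, 11]
        System.out.println(set.contains(11));   // true
        System.out.println(set.add(11));        // false: an equal member exists
    }
}
```

Because only one bucket is scanned, a membership check stays cheap as long as the members are spread across the buckets reasonably evenly.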
This makes it really, really fast to see if members belong to a set or not. If we add another 11, the hash set looks to see if it's already there. It won't add it again if it is. It uses the equals(Object) method to determine that, and of course, 11 is equal to 11.
The hash code for a string like "abc" depends on the characters in that string. When you add a duplicate string, "abc", the hash set will look in the right bucket, and then use the equals(Object) method to see if the member is already there. The equals(Object) method for String also depends on the characters, so "abc" equals "abc".
When you use a StringBuffer, though, each StringBuffer has a hash code, and equality, based on its object identity. It doesn't override the basic equals(Object) and hashCode() methods, so every StringBuffer looks to the hash set like a different object. They're not actually duplicates.
When you print the StringBuffers to the output, you're calling the toString() method on the StringBuffers. That makes them look like duplicate strings, which is why you're seeing that output.
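A small comparison may make the difference concrete. The class and method names below are invented for this sketch; it only relies on the standard behavior of String, StringBuffer and HashSet.

```java
import java.util.HashSet;
import java.util.Set;

class StringVsStringBuffer {
    // Four distinct String objects that are equal by content.
    static int distinctStrings() {
        Set<String> set = new HashSet<>();
        for (int i = 0; i < 4; i++) {
            set.add(new String("abc"));
        }
        return set.size();
    }

    // Four distinct StringBuffer objects; each one is equal only to itself.
    static int distinctBuffers() {
        Set<StringBuffer> set = new HashSet<>();
        for (int i = 0; i < 4; i++) {
            set.add(new StringBuffer("abc"));
        }
        return set.size();
    }

    public static void main(String[] args) {
        StringBuffer a = new StringBuffer("abc");
        StringBuffer b = new StringBuffer("abc");
        System.out.println(a.equals(b));                       // false
        System.out.println(a.toString().equals(b.toString())); // true
        System.out.println(distinctStrings());                 // 1
        System.out.println(distinctBuffers());                 // 4
    }
}
```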
This is also why it's very important to override hashCode() if you override equals(Object), otherwise the Set looks in the wrong bucket and you get some very odd and unpredictable behavior!
StringBuffer doesn't override either equals or hashCode - so each object is only equal to itself.
This makes sense as StringBuffer is very much "mutable by design" - and equality can cause problems when two mutable objects are equal to each other, as one can then change. Using mutable objects as keys in a map or part of a set can cause problems. If you mutate one after insertion into the collection, that invalidates the entry in the collection as the hash code is likely to change. For example, in a map you wouldn't even be able to look up the value with the same object as the key, as the first test is by hash code.
StringBuffer (and StringBuilder) are designed to be very transient objects - create them, append to them, convert them to strings, then you're done. Any time you find yourself adding them to collections, you need to take a step back and see whether it really makes sense. Just occasionally it might do, but usually only when the collection itself is short-lived.
You should consider this in your own code when overriding equals and hashCode - it's very rarely a good idea for equality to be based on any mutable aspect of an object; it makes the class harder to use correctly, and can easily lead to subtle bugs which can take a long time to debug.
Did it occur to you to look at the equals() method (or the lack of it) in StringBuffer? There lies the answer for you.
A Set, or for that matter any hash-based collection, depends on the contract defined by the equals() and hashCode() methods of Object for its behavior.
In your case, since StringBuffer doesn't override these methods, each StringBuffer instance that you create is different, i.e. new StringBuffer("abc").equals(new StringBuffer("abc")) will return false, just as == does.
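For contrast, here is a minimal sketch of a class that does honor that contract. Label is a hypothetical immutable wrapper invented for this example (it assumes a non-null text); it is not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class ContractDemo {
    // Hypothetical immutable value class: equality is based on content.
    static final class Label {
        private final String text; // assumed non-null in this sketch

        Label(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Label && text.equals(((Label) o).text);
        }

        // Equal objects must report equal hash codes, so use the same field.
        @Override
        public int hashCode() {
            return text.hashCode();
        }
    }

    // Adds two equal-by-content labels and reports how many the set kept.
    static int distinctCount() {
        Set<Label> set = new HashSet<>();
        set.add(new Label("abc"));
        set.add(new Label("abc"));
        return set.size();
    }

    public static void main(String[] args) {
        Label a = new Label("abc");
        Label b = new Label("abc");
        System.out.println(a == b);          // false: two distinct objects
        System.out.println(a.equals(b));     // true: same content
        System.out.println(distinctCount()); // 1
    }
}
```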
I am curious as to why someone would add a StringBuffer to a set.
Most mutable objects don't assume that they are the same just because they happen to contain the same data. As they are mutable, you can change the contents at any time, i.e. they might be the same now but not later, or different now but the same later.
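To make the risk with mutable set members concrete, here is a small sketch. Name is a hypothetical class written for this example whose equality depends on a mutable field; the result noted in the comment is what OpenJDK's HashSet does, since it compares the hash stored at insertion time before calling equals.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableKeyDemo {
    // Hypothetical class whose equality depends on a mutable field.
    static final class Name {
        String value;

        Name(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Name && Objects.equals(value, ((Name) o).value);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(value);
        }
    }

    // Returns whether the set still finds the key after the key was mutated.
    static boolean foundAfterMutation() {
        Set<Name> set = new HashSet<>();
        Name key = new Name("abc");
        set.add(key);
        key.value = "abd"; // hash code now differs from the one used at insertion
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false with OpenJDK's HashSet
    }
}
```

The object is still physically stored in the set, but it can no longer be found through the normal lookup path.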
BTW, you shouldn't use StringBuffer if StringBuilder is an option. StringBuffer was superseded by StringBuilder in Java 5, more than ten years ago.
Two StringBuffer objects are different objects even when they were constructed with the same argument. Therefore HashSet just adds each StringBuffer instead of treating it as a duplicate.
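If the goal is to de-duplicate by content, one possible approach (a sketch, not the only option) is to store the String snapshot of each buffer instead of the buffer itself. The class DedupByContent and its method name are made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupByContent {
    // Snapshot each buffer as an immutable String so the set compares contents.
    static Set<String> distinctContents(List<? extends CharSequence> buffers) {
        Set<String> seen = new LinkedHashSet<>(); // keeps first-seen order
        for (CharSequence cs : buffers) {
            seen.add(cs.toString());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<StringBuilder> buffers = Arrays.asList(
                new StringBuilder("abc"),
                new StringBuilder("abc"),
                new StringBuilder("abd"));
        System.out.println(distinctContents(buffers)); // [abc, abd]
    }
}
```

Since String is immutable, later changes to the original buffers cannot invalidate the entries in the set.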
StringBuffer does not override Object#equals() and Object#hashCode(), so the identity of StringBuffer instances is based not on the contents of the buffer, but on the object's address in memory.*
* That identity is based on an address in memory is not strictly required by the JLS, but is a consequence of a typical Object#hashCode() implementation. From the JavaDoc: