Stream and the distinct operation

Posted 2019-01-18 08:56

Question:

I have the following code:

class C
{
    String n;

    C(String n)
    {
        this.n = n;
    }

    public String getN() { return n; }

    @Override
    public boolean equals(Object obj)
    {
        return this.getN().equals(((C)obj).getN());
    }
}

List<C> cc = Arrays.asList(new C("ONE"), new C("TWO"), new C("ONE"));

System.out.println(cc.parallelStream().distinct().count());

but I don't understand why the count after distinct() is 3 and not 2.

Answer 1:

You need to also override the hashCode method in class C. For example:

@Override
public int hashCode() {
    return n.hashCode();
}

When two C objects are equal, their hashCode methods must return the same value.
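To make that rule concrete, here is a minimal sketch that checks it for a pair of objects. The class name HashContractCheck and the helper honorsHashContract are hypothetical names invented for this illustration; String is used only because its equals and hashCode are already consistent with each other.

```java
public class HashContractCheck {

    // Hypothetical helper: returns true when the pair does not violate the rule,
    // i.e. a.equals(b) implies a.hashCode() == b.hashCode().
    static boolean honorsHashContract(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Two distinct String objects with the same content.
        String first = new String("ONE");
        String second = new String("ONE");

        System.out.println(first.equals(second));                  // prints true
        System.out.println(first.hashCode() == second.hashCode()); // prints true
        System.out.println(honorsHashContract(first, second));     // prints true
    }
}
```

A class that overrides only equals inherits Object.hashCode, which is typically different for distinct instances, so two equal instances of such a class would usually fail this check.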

The API documentation for interface Stream does not mention this, but it's well-known that if you override equals, you should also override hashCode. The API documentation for Object.equals() mentions this:

Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

Stream.distinct() evidently relies on the objects' hash codes as well as on equals: once hashCode is implemented as shown above, the code prints the expected result, 2.
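Putting it together, a complete version of the example might look like the sketch below. The wrapper class DistinctDemo and the helper countDistinct are names made up for this illustration, and the instanceof check in equals is an extra safeguard that is not part of the original question. The sketch assumes n is never null.

```java
import java.util.Arrays;
import java.util.List;

public class DistinctDemo {

    // The class from the question, with hashCode added.
    static class C {
        private final String n; // assumed to be non-null

        C(String n) {
            this.n = n;
        }

        public String getN() {
            return n;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            // Rejects null and unrelated types instead of throwing.
            if (!(obj instanceof C)) return false;
            return n.equals(((C) obj).n);
        }

        @Override
        public int hashCode() {
            // Consistent with equals: objects with the same n share a hash code.
            return n.hashCode();
        }
    }

    // Hypothetical helper wrapping the stream pipeline from the question.
    static long countDistinct(List<C> items) {
        return items.parallelStream().distinct().count();
    }

    public static void main(String[] args) {
        List<C> cc = Arrays.asList(new C("ONE"), new C("TWO"), new C("ONE"));
        System.out.println(countDistinct(cc)); // prints 2
    }
}
```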