C# Distinct on IEnumerable with custom IEqualit

Posted 2019-01-16 10:16

Question:

Here's what I'm trying to do. I'm querying an XML file using LINQ to XML, which gives me an IEnumerable<T> object, where T is my "Village" class, filled with the results of this query. Some results are duplicated, so I would like to perform a Distinct() on the IEnumerable object, like so:

public IEnumerable<Village> GetAllAlliances()
{
    try
    {
        IEnumerable<Village> alliances =
             from alliance in xmlDoc.Elements("Village")
             where alliance.Element("AllianceName").Value != String.Empty
             orderby alliance.Element("AllianceName").Value
             select new Village
             {
                 AllianceName = alliance.Element("AllianceName").Value
             };

        // TODO: make it work...
        return alliances.Distinct(new AllianceComparer());
    }
    catch (Exception ex)
    {
        throw new Exception("GetAllAlliances", ex);
    }
}

As the default comparer would not work for the Village object, I implemented a custom one, as seen here in the AllianceComparer class:

public class AllianceComparer : IEqualityComparer<Village>
{
    #region IEqualityComparer<Village> Members
    bool IEqualityComparer<Village>.Equals(Village x, Village y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) 
            return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.AllianceName == y.AllianceName;
    }

    int IEqualityComparer<Village>.GetHashCode(Village obj)
    {
        return obj.GetHashCode();
    }
    #endregion
}

The Distinct() method doesn't work, as I have exactly the same number of results with or without it. Another thing, and I don't know if it's usually possible, but I cannot step into AllianceComparer.Equals() to see what could be the problem.
I've found examples of this on the Internet, but I can't seem to make my implementation work.

Hopefully, someone here might see what could be wrong here! Thanks in advance!

Answer 1:

The problem is with your GetHashCode. You should alter it to return the hash code of AllianceName instead.

int IEqualityComparer<Village>.GetHashCode(Village obj)
{
    return obj.AllianceName.GetHashCode();
}

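The thing is, if Equals returns true for two objects, they must also have the same hash code. That is not the case here: unless Village overrides it, obj.GetHashCode() is the default Object.GetHashCode(), which differs between separate Village instances even when their AllianceName is the same. Since Distinct works by building a hash table internally, objects with different hash codes are normally never passed to Equals, so you end up with equal objects that are not matched at all. That is most likely also why you could never step into AllianceComparer.Equals().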

Similarly, when comparing two files, if their hashes are not the same, you don't need to check the files themselves at all: they are different. Only if the hashes match do you go on to check whether the contents are really the same. That's exactly how the hash table that Distinct uses behaves.
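
To make the rule concrete, here is a minimal sketch of a comparer whose Equals and GetHashCode are both derived from AllianceName. The class name AllianceNameComparer, the sample data, and the null handling are assumptions for illustration, not part of the original code; Village is reduced to the one property the question uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's Village class (only the property used here).
public class Village
{
    public string AllianceName { get; set; }
}

// Hypothetical comparer: Equals and GetHashCode both depend only on
// AllianceName, so objects that compare equal always hash alike.
public sealed class AllianceNameComparer : IEqualityComparer<Village>
{
    public bool Equals(Village x, Village y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.AllianceName, y.AllianceName, StringComparison.Ordinal);
    }

    public int GetHashCode(Village obj)
    {
        // Assumption: a null Village or a null AllianceName hashes to 0
        // instead of throwing a NullReferenceException.
        return obj?.AllianceName?.GetHashCode() ?? 0;
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data with one duplicated alliance name.
        var villages = new List<Village>
        {
            new Village { AllianceName = "North" },
            new Village { AllianceName = "North" },
            new Village { AllianceName = "South" },
        };

        int distinctCount = villages.Distinct(new AllianceNameComparer()).Count();
        Console.WriteLine(distinctCount); // prints 2
    }
}
```

With the original GetHashCode (return obj.GetHashCode();) the same call would normally report 3, because each Village instance gets its own default hash code.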



Answer 2:

return alliances.Select(v => v.AllianceName).Distinct();

That would return an IEnumerable<string> instead of IEnumerable<Village>.
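
This works without a custom comparer because string already compares and hashes by value. A rough sketch of that approach follows, together with a GroupBy variant for the case where Village objects are still needed. The class AllianceQueries, its method names, and the sample data are hypothetical and only there to make the example runnable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's Village class (only the property used here).
public class Village
{
    public string AllianceName { get; set; }
}

// Hypothetical helper class, not part of the original code.
public static class AllianceQueries
{
    // Distinct names only: string has value-based Equals/GetHashCode,
    // so the default comparer is enough.
    public static IEnumerable<string> DistinctNames(IEnumerable<Village> villages)
    {
        return villages.Select(v => v.AllianceName).Distinct();
    }

    // One Village per alliance name: keep the first element of each group.
    public static IEnumerable<Village> FirstPerAlliance(IEnumerable<Village> villages)
    {
        return villages.GroupBy(v => v.AllianceName).Select(g => g.First());
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data with one duplicated alliance name.
        var villages = new List<Village>
        {
            new Village { AllianceName = "North" },
            new Village { AllianceName = "South" },
            new Village { AllianceName = "North" },
        };

        foreach (string name in AllianceQueries.DistinctNames(villages))
        {
            Console.WriteLine(name); // North, then South
        }

        Console.WriteLine(AllianceQueries.FirstPerAlliance(villages).Count()); // prints 2
    }
}
```

Note that the first variant changes the return type of GetAllAlliances() to IEnumerable<string>, so callers would need to be updated accordingly.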



Answer 3:

Or change the line

return alliances.Distinct(new AllianceComparer());

to

return alliances.Select(v => v.AllianceName).Distinct();