Several Linq.Enumerable functions take an `IEqualityComparer<T>`. Is there a convenient wrapper class that adapts a delegate(T,T)=>bool to implement `IEqualityComparer<T>`? It's easy enough to write one (if you ignore problems with defining a correct hash code), but I'd like to know if there is an out-of-the-box solution.
Specifically, I want to do set operations on `Dictionary`s, using only the Keys to define membership (while retaining the values according to different rules).
Just one optimization: we can use the out-of-the-box EqualityComparer for value comparisons, rather than delegating them.
This would also make the implementation cleaner, as the actual comparison logic now stays in GetHashCode() and Equals(), which you may have already overridden.
Here is the code:
Don't forget to override the GetHashCode() and Equals() methods on your object.
This post helped me: c# compare two generic values
Sushil
This makes it possible to select a property with a lambda like this:
.Select(y => y.Article).Distinct(x => x.ArticleID);
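The code for that answer did not survive here, so what follows is only a minimal sketch of how such a key-selecting `Distinct` overload could look. The names `KeySelectorComparer` and `DistinctByKeyExtensions` are made up for illustration; the key comparison is handed to `EqualityComparer<TKey>.Default`, as the answer suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compares items of type T by a key extracted from them.
public sealed class KeySelectorComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;

    public KeySelectorComparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector ?? throw new ArgumentNullException(nameof(keySelector));
    }

    // Two items are equal when their keys are equal according to the key type's
    // own equality (IEquatable<TKey> or overridden Equals/GetHashCode).
    public bool Equals(T x, T y) =>
        EqualityComparer<TKey>.Default.Equals(_keySelector(x), _keySelector(y));

    // The hash comes from the same key, so it stays consistent with Equals.
    public int GetHashCode(T obj)
    {
        TKey key = _keySelector(obj);
        return key == null ? 0 : EqualityComparer<TKey>.Default.GetHashCode(key);
    }
}

public static class DistinctByKeyExtensions
{
    // Hypothetical overload: Distinct driven by a key selector instead of a comparer.
    public static IEnumerable<T> Distinct<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector) =>
        source.Distinct(new KeySelectorComparer<T, TKey>(keySelector));
}
```

On .NET 6 and later, the built-in `Enumerable.DistinctBy` covers the same use case without a custom comparer.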
When you want to customize equality checking, 99% of the time you're interested in defining the keys to compare by, not the comparison itself.
This could be an elegant solution (concept from Python's list sort method).
Usage:
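The original listings are missing here, so this is only a rough sketch of what a key-based comparer and its use might look like. The shape of `KeyEqualityComparer<T>` is inferred from the class name and from the `Func<T, object>` mentioned further down; the `KeyComparerUsage.ExceptByKey` helper is invented for illustration and applies it to the dictionary scenario from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: equality of T is defined entirely by an extracted key.
public class KeyEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, object> _keyExtractor;

    public KeyEqualityComparer(Func<T, object> keyExtractor)
    {
        _keyExtractor = keyExtractor ?? throw new ArgumentNullException(nameof(keyExtractor));
    }

    public bool Equals(T x, T y) =>
        object.Equals(_keyExtractor(x), _keyExtractor(y));

    // Hash the key, not the item, so equal keys always land in the same bucket.
    public int GetHashCode(T obj)
    {
        object key = _keyExtractor(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

public static class KeyComparerUsage
{
    // Hypothetical usage: entries of `left` whose keys do not occur in `right`.
    // Values play no part in deciding membership.
    public static Dictionary<string, int> ExceptByKey(
        Dictionary<string, int> left, Dictionary<string, int> right)
    {
        var byKey = new KeyEqualityComparer<KeyValuePair<string, int>>(kvp => kvp.Key);
        return left.Except(right, byKey)
                   .ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
    }
}
```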
The `KeyEqualityComparer` class:
I'm afraid there is no such wrapper out-of-box. However it's not hard to create one:
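The answer's own code is not shown here, so this is a sketch of one way such a delegate wrapper might be written, not a reconstruction of it. The name `DelegateEqualityComparer` and the optional hash delegate are assumptions; the fallback of hashing everything to 0 is a deliberate trade-off, correct for any equality delegate but slow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper adapting delegates to IEqualityComparer<T>.
public class DelegateEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _getHashCode;

    // If no hash function is supplied, every item hashes to 0. That is always
    // consistent with the equality delegate, but it puts all items in one
    // bucket, so hash-based operations degrade to linear scans.
    public DelegateEqualityComparer(Func<T, T, bool> equals, Func<T, int> getHashCode = null)
    {
        _equals = equals ?? throw new ArgumentNullException(nameof(equals));
        _getHashCode = getHashCode ?? new Func<T, int>(item => 0);
    }

    public bool Equals(T x, T y) => _equals(x, y);

    public int GetHashCode(T obj) => _getHashCode(obj);
}

public static class DelegateComparerUsage
{
    // Example: count strings that differ by more than letter case.
    public static int CountDistinctIgnoringCase(IEnumerable<string> items)
    {
        var comparer = new DelegateEqualityComparer<string>(
            (a, b) => string.Equals(a, b, StringComparison.OrdinalIgnoreCase));
        return items.Distinct(comparer).Count();
    }
}
```

Passing a real hash delegate that agrees with the equality delegate restores the usual hash-table performance.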
On the importance of `GetHashCode`
Others have already commented on the fact that any custom `IEqualityComparer<T>` implementation should really include a `GetHashCode` method; but nobody's bothered to explain why in any detail.
Here's why. Your question specifically mentions the LINQ extension methods; nearly all of these rely on hash codes to work properly, because they utilize hash tables internally for efficiency.
Take `Distinct`, for example. Consider the implications of this extension method if all it utilized were an `Equals` method. How do you determine whether an item's already been scanned in a sequence if you only have `Equals`? You enumerate over the entire collection of values you've already looked at and check for a match. This would result in `Distinct` using a worst-case O(N²) algorithm instead of an O(N) one!
Fortunately, this isn't the case. `Distinct` doesn't just use `Equals`; it uses `GetHashCode` as well. In fact, it absolutely does not work properly without an `IEqualityComparer<T>` that supplies a proper `GetHashCode`. Below is a contrived example illustrating this.
Say I have the following type:
Now say I have a `List<Value>` and I want to find all of the elements with a distinct name. This is a perfect use case for `Distinct` using a custom equality comparer. So let's use the `Comparer<T>` class from Aku's answer:
Now, if we have a bunch of `Value` elements with the same `Name` property, they should all collapse into one value returned by `Distinct`, right? Let's see...
Output:
Hmm, that didn't work, did it?
What about `GroupBy`? Let's try that:
Output:
Again: didn't work.
If you think about it, it would make sense for `Distinct` to use a `HashSet<T>` (or equivalent) internally, and for `GroupBy` to use something like a `Dictionary<TKey, List<T>>` internally. Could this explain why these methods don't work? Let's try this:
Output:
Yeah... starting to make sense?
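The listings and outputs for these experiments are not reproduced above, so here is a self-contained sketch of the same effect rather than the original code. Everything in it is an assumption for illustration: the `Id` property on `Value`, the two comparer classes, and the `HashCodeDemo` helper. The broken comparer hashes by `Id` so that the mismatch between `Equals` and `GetHashCode` is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Value
{
    public int Id { get; set; }       // hypothetical; only used to make the bad hash deterministic
    public string Name { get; set; }
}

// Equals compares names, but GetHashCode has nothing to do with Name.
public class BrokenNameComparer : IEqualityComparer<Value>
{
    public bool Equals(Value x, Value y) => x.Name == y.Name;
    public int GetHashCode(Value obj) => obj.Id;
}

// Equals and GetHashCode are both derived from Name.
public class NameComparer : IEqualityComparer<Value>
{
    public bool Equals(Value x, Value y) => x.Name == y.Name;
    public int GetHashCode(Value obj) => obj.Name == null ? 0 : obj.Name.GetHashCode();
}

public static class HashCodeDemo
{
    // Three values that share a name but have different ids.
    public static List<Value> SameNameValues() => new List<Value>
    {
        new Value { Id = 1, Name = "x" },
        new Value { Id = 2, Name = "x" },
        new Value { Id = 3, Name = "x" },
    };

    public static int DistinctCount(IEqualityComparer<Value> comparer) =>
        SameNameValues().Distinct(comparer).Count();

    public static int GroupCount(IEqualityComparer<Value> comparer) =>
        SameNameValues().GroupBy(v => v, comparer).Count();

    public static int HashSetCount(IEqualityComparer<Value> comparer) =>
        new HashSet<Value>(SameNameValues(), comparer).Count;
}
```

With `BrokenNameComparer`, all three helpers should report 3, because items with different hash codes are never passed to `Equals`; with `NameComparer` they report 1.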
Hopefully from these examples it's clear why including an appropriate `GetHashCode` in any `IEqualityComparer<T>` implementation is so important.
Original answer
Expanding on orip's answer:
There are a couple of improvements that can be made here.
First, take a `Func<T, TKey>` instead of `Func<T, object>`; this will prevent boxing of value type keys in the actual `keyExtractor` itself.
Second, add a `where TKey : IEquatable<TKey>` constraint; this will prevent boxing in the `Equals` call (`object.Equals` takes an `object` parameter; you need an `IEquatable<TKey>` implementation to take a `TKey` parameter without boxing it). Clearly this may pose too severe a restriction, so you could make a base class without the constraint and a derived class with it.
Here's what the resulting code might look like:
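The original listing is missing at this point. As a stand-in, this is a sketch of the base-plus-derived layout just described, not the answer's actual code. The class names are assumptions, and the unconstrained base delegates to `EqualityComparer<TKey>.Default`, which is my choice rather than something the answer states.

```csharp
using System;
using System.Collections.Generic;

// Unconstrained base: works for any key type.
public class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    protected readonly Func<T, TKey> KeyExtractor;

    public KeyEqualityComparer(Func<T, TKey> keyExtractor)
    {
        KeyExtractor = keyExtractor ?? throw new ArgumentNullException(nameof(keyExtractor));
    }

    public virtual bool Equals(T x, T y) =>
        EqualityComparer<TKey>.Default.Equals(KeyExtractor(x), KeyExtractor(y));

    public int GetHashCode(T obj)
    {
        TKey key = KeyExtractor(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

// Constrained variant: the IEquatable<TKey> constraint lets Equals take a
// TKey argument directly instead of an object.
public class StrictKeyEqualityComparer<T, TKey> : KeyEqualityComparer<T, TKey>
    where TKey : IEquatable<TKey>
{
    public StrictKeyEqualityComparer(Func<T, TKey> keyExtractor)
        : base(keyExtractor)
    {
    }

    public override bool Equals(T x, T y)
    {
        TKey keyX = KeyExtractor(x);
        TKey keyY = KeyExtractor(y);
        if (keyX == null) return keyY == null;
        return keyX.Equals(keyY); // resolves to IEquatable<TKey>.Equals(TKey)
    }
}
```

Since `EqualityComparer<TKey>.Default` already uses `IEquatable<TKey>` when the key type implements it, the base class alone may be enough in practice; the strict variant mainly makes that requirement explicit at compile time.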
I don't know of an existing class but something like:
Note: I haven't actually compiled and run this yet, so there might be a typo or other bug.