LINQ: Use .Except() on collections of different types

Posted 2019-04-07 21:55

Question:

Given two lists of different types, is it possible to make those types convertible to, or comparable with, each other (e.g. with a TypeConverter or similar) so that a LINQ query can compare them? I've seen similar questions on SO, but none that point to making the types convertible to each other as a way to solve the problem.

Collection Types:

public class Data
{
    public int ID { get; set; }
}

public class ViewModel
{
    private Data _data;

    public ViewModel(Data data)
    {
        _data = data;
    }
}

Desired usage:

    public void DoMerge(ObservableCollection<ViewModel> destination, IEnumerable<Data> data)
    {
        // 1. Find items in data that don't already exist in destination
        var newData = destination.Except(data);

        // ...
    }

It would seem logical that since I know how to compare an instance of ViewModel to an instance of Data I should be able to provide some comparison logic that LINQ would then use for queries like .Except(). Is this possible?

Answer 1:

Your best bet is to provide a projection from Data to ViewModel so that you can say

var newData = destination.Except(data.Select(x => f(x)));

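where f maps Data to ViewModel. Because the sequences being compared then contain ViewModel items, you will also need an IEqualityComparer<ViewModel> (or Equals and GetHashCode overrides on ViewModel); otherwise Except falls back to reference equality and every projected item counts as different.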
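A minimal sketch of this approach, under two assumptions that are not in the question: ViewModel exposes its wrapped object through a public Data property (the class in the question keeps it private), and two items count as equal when their IDs match. The names ViewModelByIdComparer and MergeSketch are hypothetical. Since Except runs over ViewModel items here, the comparer in this sketch is an IEqualityComparer<ViewModel>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public int ID { get; set; }
}

public class ViewModel
{
    // Assumption: the wrapped Data is exposed publicly so it can be compared.
    public Data Data { get; }

    public ViewModel(Data data)
    {
        Data = data;
    }
}

// Hypothetical comparer: two view models are equal when their Data IDs match.
public sealed class ViewModelByIdComparer : IEqualityComparer<ViewModel>
{
    public bool Equals(ViewModel x, ViewModel y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Data.ID == y.Data.ID;
    }

    public int GetHashCode(ViewModel obj)
    {
        return obj.Data.ID.GetHashCode();
    }
}

public static class MergeSketch
{
    // Returns the items of destination whose ID does not appear in data.
    public static List<ViewModel> NotInData(
        IEnumerable<ViewModel> destination, IEnumerable<Data> data)
    {
        return destination
            .Except(data.Select(d => new ViewModel(d)), new ViewModelByIdComparer())
            .ToList();
    }
}
```

The projection here is simply the ViewModel constructor. If constructing a ViewModel is expensive or has side effects, comparing on the ID directly is likely the better choice.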



Answer 2:

I assume that providing a projection from Data to ViewModel is problematic, so I'm offering another solution in addition to Jason's.

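Except uses a hash set (if I recall correctly), so you can get similar performance by creating your own hash set. I'm also assuming that you identify Data objects as equal when their IDs are equal, and that ViewModel exposes its wrapped object through a public Data property (the class in the question keeps it in a private field).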

var oldIDs = new HashSet<int>(data.Select(d => d.ID));
var newData = destination.Where(vm => !oldIDs.Contains(vm.Data.ID));

You might have another use for a collection of "oldData" elsewhere in the method, in which case, you would want to do this instead. Either implement IEquatable<Data> on your data class, or create a custom IEqualityComparer<Data> for the hash set:

var oldData = new HashSet<Data>(data);
//or: var oldData = new HashSet<Data>(data, new DataEqualityComparer());
var newData = destination.Where(vm => !oldData.Contains(vm.Data));
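The DataEqualityComparer named in the comment above is not defined in this answer. A minimal sketch of what it could look like, assuming equality by ID and a public Data property on ViewModel; HashSetFilterSketch is a hypothetical wrapper used only to make the example self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public int ID { get; set; }
}

public class ViewModel
{
    // Assumption: the wrapped Data is exposed publicly.
    public Data Data { get; }

    public ViewModel(Data data)
    {
        Data = data;
    }
}

// Treats two Data instances as equal when their IDs are equal.
public sealed class DataEqualityComparer : IEqualityComparer<Data>
{
    public bool Equals(Data x, Data y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(Data obj)
    {
        return obj.ID.GetHashCode();
    }
}

public static class HashSetFilterSketch
{
    // Returns the view models whose Data is not present in data.
    public static List<ViewModel> NotInData(
        IEnumerable<ViewModel> destination, IEnumerable<Data> data)
    {
        var oldData = new HashSet<Data>(data, new DataEqualityComparer());
        return destination.Where(vm => !oldData.Contains(vm.Data)).ToList();
    }
}
```

Because ID has a public setter, changing it on an object that is already in the set makes later lookups for that object unreliable, so the set is best built and used within a single method call.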


Answer 3:

If you use this:

var newData = destination.Except(data.Select(x => f(x)));

you have to project 'data' to the same type as the items in 'destination'. The code below removes that limitation:

//Here is how you can compare two different sets.
class A { public string Bar { get; set; } }
class B { public string Foo { get; set; } }

IEnumerable<A> setOfA = new A[] { /*...*/ };
IEnumerable<B> setOfB = new B[] { /*...*/ };
var subSetOfA1 = setOfA.Except(setOfB, a => a.Bar, b => b.Foo);

//Alternatively, pass a custom equality comparer, for instance if the comparison should not be case sensitive.
var subSetOfA2 = setOfA.Except(setOfB, a => a.Bar, b => b.Foo, StringComparer.OrdinalIgnoreCase);

//Here is the extension class definition allowing you to use the code above
public static class IEnumerableExtension
{
    public static IEnumerable<TFirst> Except<TFirst, TSecond, TCompared>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TCompared> firstSelect,
        Func<TSecond, TCompared> secondSelect)
    {
        return Except(first, second, firstSelect, secondSelect, EqualityComparer<TCompared>.Default);
    }

    public static IEnumerable<TFirst> Except<TFirst, TSecond, TCompared>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TCompared> firstSelect,
        Func<TSecond, TCompared> secondSelect,
        IEqualityComparer<TCompared> comparer)
    {
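        if (first == null)
            throw new ArgumentNullException(nameof(first));
        if (second == null)
            throw new ArgumentNullException(nameof(second));
        if (firstSelect == null)
            throw new ArgumentNullException(nameof(firstSelect));
        if (secondSelect == null)
            throw new ArgumentNullException(nameof(secondSelect));
        return ExceptIterator<TFirst, TSecond, TCompared>(first, second, firstSelect, secondSelect, comparer);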
    }

    private static IEnumerable<TFirst> ExceptIterator<TFirst, TSecond, TCompared>(
        IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TCompared> firstSelect,
        Func<TSecond, TCompared> secondSelect,
        IEqualityComparer<TCompared> comparer)
    {
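        // Seed the set with the keys of second. Add returns false for a key that is already
        // present, so items of first whose key is in second (or was already yielded) are skipped.
        HashSet<TCompared> set = new HashSet<TCompared>(second.Select(secondSelect), comparer);
        foreach (TFirst item in first)
            if (set.Add(firstSelect(item)))
                yield return item;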
    }
}

Some may argue that this is memory-inefficient because it uses a HashSet, but the framework's Enumerable.Except method does the same thing with a similar internal class called 'Set' (I checked by decompiling).
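On .NET 6 or later, the built-in Enumerable.ExceptBy handles the keyed case without a custom extension. A brief sketch, assuming that runtime and reusing the A and B classes from above; the sample values and the ExceptBySketch wrapper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A { public string Bar { get; set; } }
public class B { public string Foo { get; set; } }

public static class ExceptBySketch
{
    public static List<A> Run()
    {
        var setOfA = new[]
        {
            new A { Bar = "x" },
            new A { Bar = "Y" },
            new A { Bar = "z" }
        };
        var setOfB = new[] { new B { Foo = "y" } };

        // ExceptBy takes the keys of the second sequence directly, so setOfB is
        // projected up front; the key selector applies to setOfA only.
        return setOfA
            .ExceptBy(setOfB.Select(b => b.Foo), a => a.Bar, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

With the case-insensitive comparer, the item whose Bar is "Y" matches the key "y" and is excluded. Like Except, ExceptBy returns each distinct key at most once.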