ILookup vs. IGrouping

Posted 2019-01-07 07:07

Question:

I've been having trouble articulating the differences between ILookup<TKey, TVal> and IGrouping<TKey, TVal>, and am curious if I understand it correctly now. LINQ compounded the issue by producing sequences of IGrouping items while also giving me a ToLookup extension method. So it felt like they were the same until I looked more closely.

var q1 = 
    from n in N
    group n by n.MyKey into g
    select g;
// q1 is IEnumerable<IGrouping<TKey, TVal>>

Which is equivalent to:

var q2 = N.GroupBy(n => n.MyKey, n => n);
// q2 is IEnumerable<IGrouping<TKey, TVal>>

Which looks a lot like:

var q3 = N.ToLookup(n => n.MyKey, n => n);
// q3 is ILookup<TKey, TVal>

Am I correct in the following analogies?

  1. An IGrouping<TKey, TVal> is a single group (i.e. a keyed sequence), analogous to KeyValuePair<TKey, TVal> where the value is actually a sequence of elements (rather than a single element)
  2. An IEnumerable<IGrouping<TKey, TVal>> is a sequence of those (similar to what you get when iterating over an IDictionary<TKey, TVal>)
  3. An ILookup<TKey, TVal> is more like an IDictionary<TKey, TVal> where the value is actually a sequence of elements

Answer 1:

Yes, all of those are correct.

And ILookup<TKey, TValue> also extends IEnumerable<IGrouping<TKey, TValue>> so you can iterate over all the key/collection pairs as well as (or instead of) just looking up particular keys.

I basically think of ILookup<TKey,TValue> as being like IDictionary<TKey, IEnumerable<TValue>>.
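As a rough illustration of that mental model, here is a minimal sketch. The class name LookupDemo and the sample words are hypothetical, chosen only for this example. It shows the dictionary-like indexer, the sequence-like iteration, and one place where the analogy breaks down: indexing a missing key returns an empty sequence instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Hypothetical sample data: words keyed by their first letter.
    public static ILookup<char, string> Build()
    {
        return new[] { "apple", "avocado", "banana" }.ToLookup(w => w[0]);
    }

    public static void Main()
    {
        ILookup<char, string> lookup = Build();

        // Dictionary-like access: the indexer returns the sequence for a key.
        Console.WriteLine(string.Join(", ", lookup['a'])); // apple, avocado

        // Unlike a dictionary, a missing key yields an empty sequence.
        Console.WriteLine(lookup['z'].Count()); // 0

        // Sequence-like access: each item is an IGrouping<char, string>.
        foreach (IGrouping<char, string> g in lookup)
            Console.WriteLine(g.Key + ": " + g.Count()); // a: 2, then b: 1
    }
}
```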

Bear in mind that ToLookup is a "do it now" operation (immediate execution) whereas a GroupBy is deferred. As it happens, with the way that "pull LINQ" works, when you start pulling IGroupings from the result of a GroupBy, it has to read all the data anyway (because you can't switch group midway through) whereas in other implementations it may be able to produce a streaming result. (It does in Push LINQ; I would expect LINQ to Events to be the same.)
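The buffering behaviour described above can be observed in LINQ to Objects with a source that counts how many elements have been requested. This is only a sketch: the Pulled counter, the Source iterator, and the Measure helper are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByBufferingDemo
{
    // Number of elements requested from the source so far.
    public static int Pulled;

    // Hypothetical source that records each element it hands out.
    static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 6; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Returns { pulled before enumeration, pulled after taking the first group }.
    public static int[] Measure()
    {
        Pulled = 0;
        IEnumerable<IGrouping<int, int>> groups = Source().GroupBy(n => n % 2);
        int before = Pulled;                      // deferred: nothing read yet
        IGrouping<int, int> first = groups.First();
        int after = Pulled;                       // whole source has been read
        return new[] { before, after };
    }

    public static void Main()
    {
        int[] result = Measure();
        Console.WriteLine(result[0]); // 0
        Console.WriteLine(result[1]); // 6
    }
}
```

So GroupBy defers the work until the first group is requested, but at that point it consumes the entire source, just as ToLookup does up front.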



Answer 2:

There is another important difference between ILookup and IDictionary: the former enforces immutability in the sense that there are no methods for changing the data (except when the consumer performs an explicit cast). By contrast, IDictionary has methods like "Add" which allow changing the data. So, from the perspective of functional programming and/or parallel programming, ILookup is nicer. (I only wish there was also a version of ILookup that assigns only one value to a key rather than a group.)
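A small sketch of what that read-only surface looks like to a consumer. The class ReadOnlyLookupDemo and the Describe helper are hypothetical names used only here. The consumer can call Contains, Count, and the indexer, but there is nothing to call that would change the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReadOnlyLookupDemo
{
    // A consumer of ILookup can query it, but the interface declares no mutators.
    public static string Describe(ILookup<int, string> byLength, int length)
    {
        if (!byLength.Contains(length))
            return "no words of length " + length;
        return string.Join(",", byLength[length]);
    }

    public static void Main()
    {
        ILookup<int, string> byLength =
            new[] { "one", "two", "three" }.ToLookup(w => w.Length);

        Console.WriteLine(Describe(byLength, 3)); // one,two
        Console.WriteLine(Describe(byLength, 4)); // no words of length 4

        // byLength.Add(4, "four");  // would not compile: ILookup has no Add

        // IDictionary, by contrast, exposes Add (and Remove, Clear, ...).
        IDictionary<int, List<string>> dict = new Dictionary<int, List<string>>();
        dict.Add(4, new List<string> { "four" });
        Console.WriteLine(dict[4].Count); // 1
    }
}
```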

(Btw., it seems worth pointing out that the relationship between IEnumerable and IList is somewhat similar to the one between ILookup and IDictionary - the former is immutable, the latter is not.)



Answer 3:

GroupBy and ToLookup have almost the same functionality, except for the following (reference):

GroupBy: The GroupBy operator returns groups of elements based on some key value. Each group is represented by an IGrouping object.

ToLookup: ToLookup is the same as GroupBy; the only difference is that the execution of GroupBy is deferred, whereas ToLookup executes immediately.

Let's clarify the difference with sample code. Suppose we have a class representing a personnel record:

class Personnel
{
    public int Id { get; set; }
    public string FullName { get; set; }
    public int Level { get; set; }
}

After that, we define a list of personnel as below:

var personnels = new List<Personnel>
{
    new Personnel { Id = 1, FullName = "P1", Level = 1 },
    new Personnel { Id = 2, FullName = "P2", Level = 2 },
    new Personnel { Id = 3, FullName = "P3", Level = 1 },
    new Personnel { Id = 4, FullName = "P4", Level = 1 },
    new Personnel { Id = 5, FullName = "P5", Level = 2 },
    new Personnel { Id = 6, FullName = "P6", Level = 2 },
    new Personnel { Id = 7, FullName = "P7", Level = 2 }
};

Now I need to get the personnel grouped by their level. I have two approaches here: GroupBy or ToLookup. If I use GroupBy, as stated before, it uses deferred execution. This means the grouping is not computed when GroupBy is called; it is computed only when the result is enumerated.

var groups = personnels.GroupBy(p => p.Level);
// Mutate the source after calling GroupBy but before enumerating the result.
personnels.RemoveAll(p => p.Level == 1);
foreach (var group in groups)
{
    Console.WriteLine(group.Key);
    foreach (var item in group)
        Console.WriteLine(item.Id + " >>> " + item.FullName + " >>> " + item.Level);
}

In the code above, I first grouped the personnel, but before iterating over the groups I removed some of them. Because GroupBy uses deferred execution, the final result does not include the removed items: the grouping is actually computed at the foreach.

Output:

2
2 >>> P2 >>> 2
5 >>> P5 >>> 2
6 >>> P6 >>> 2
7 >>> P7 >>> 2

But if I rewrite the code as below (note that it is the same as the previous code except that GroupBy is replaced by ToLookup):

var groups = personnels.ToLookup(p => p.Level);
// The lookup is already built, so this removal no longer affects it.
personnels.RemoveAll(p => p.Level == 1);
foreach (var group in groups)
{
    Console.WriteLine(group.Key);
    foreach (var item in group)
        Console.WriteLine(item.Id + " >>> " + item.FullName + " >>> " + item.Level);
}

Because ToLookup uses immediate execution, the result is generated and the grouping is applied at the moment I call ToLookup. So if I remove items from personnels after that call but before iterating, it won't affect the final result.

Output:

1
1 >>> P1 >>> 1
3 >>> P3 >>> 1
4 >>> P4 >>> 1
2
2 >>> P2 >>> 2
5 >>> P5 >>> 2
6 >>> P6 >>> 2
7 >>> P7 >>> 2

Note: GroupBy and ToLookup also return different types: IEnumerable<IGrouping<TKey, TElement>> and ILookup<TKey, TElement>, respectively.

You might use ToDictionary instead of ToLookup, but you need to pay attention to this (reference):

The usage of ToLookup() is very similar to that of ToDictionary(); both allow you to specify key selectors, value selectors, and comparers. The main difference is that ToLookup() allows (and expects) duplicate keys, whereas ToDictionary() does not.
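A short sketch of that duplicate-key difference. The class name DuplicateKeyDemo and the sample words are hypothetical. ToDictionary throws an ArgumentException as soon as it meets a key it has already added, while ToLookup collects the elements sharing a key into one group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateKeyDemo
{
    // Hypothetical sample data: two words share the key 'a'.
    static readonly string[] Words = { "apple", "avocado", "banana" };

    // True if ToDictionary rejects the duplicate key.
    public static bool ToDictionaryThrows()
    {
        try
        {
            Words.ToDictionary(w => w[0]);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // Number of distinct keys that ToLookup produces for the same input.
    public static int LookupGroupCount()
    {
        return Words.ToLookup(w => w[0]).Count;
    }

    public static void Main()
    {
        Console.WriteLine(ToDictionaryThrows()); // True
        Console.WriteLine(LookupGroupCount());   // 2
    }
}
```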