Handling warning for possible multiple enumeration

2019-01-02 14:39发布

问题:

In my code in need to use an IEnumerable<> several times thus get the Resharper error of "Possible multiple enumeration of IEnumerable".

Sample code:

public List<object> Foo(IEnumerable<object> objects)
{
    if (objects == null || !objects.Any())
        throw new ArgumentException();

    var firstObject = objects.First();
    var list = DoSomeThing(firstObject);        
    var secondList = DoSomeThingElse(objects);
    list.AddRange(secondList);

    return list;
}
  • I can change the objects parameter to be List and then avoid the possible multiple enumeration but then I don't get the highest object that I can handle.
  • Another thing that I can do is to convert the IEnumerable to List at the beginning of the method:

 public List<object> Foo(IEnumerable<object> objects)
 {
    var objectList = objects.ToList();
    // ...
 }

But this is just awkward.

What would you do in this scenario?

回答1:

The problem with taking IEnumerable as a parameter is that it tells callers "I wish to enumerate this". It doesn't tell them how many times you wish to enumerate.

I can change the objects parameter to be List and then avoid the possible multiple enumeration but then I don't get the highest object that I can handle.

The goal of taking the highest object is noble, but it leaves room for too many assumptions. Do you really want someone to pass a LINQ to SQL query to this method, only for you to enumerate it twice (getting potentially different results each time?)

The semantic missing here is that a caller, who perhaps doesn't take time to read the details of the method, may assume you only iterate once - so they pass you an expensive object. Your method signature doesn't indicate either way.

By changing the method signature to IList/ICollection, you will at least make it clearer to the caller what your expectations are, and they can avoid costly mistakes.

Otherwise, most developers looking at the method might assume you only iterate once. If taking an IEnumerable is so important, you should consider doing the .ToList() at the start of the method.

It's a shame .NET doesn't have an interface that is IEnumerable + Count + Indexer, without Add/Remove etc. methods, which is what I suspect would solve this problem.



回答2:

If your data is always going to be repeatable, perhaps don't worry about it. However, you can unroll it too - this is especially useful if the incoming data could be large (for example, reading from disk/network):

if(objects == null) throw new ArgumentException();
using(var iter = objects.GetEnumerator()) {
    if(!iter.MoveNext()) throw new ArgumentException();

    var firstObject = iter.Current;
    var list = DoSomeThing(firstObject);  

    while(iter.MoveNext()) {
        list.Add(DoSomeThingElse(iter.Current));
    }
    return list;
}

Note I changed the semantic of DoSomethingElse a bit, but this is mainly to show unrolled usage. You could re-wrap the iterator, for example. You could make it an iterator block too, which could be nice; then there is no list - and you would yield return the items as you get them, rather than add to a list to be returned.



回答3:

If the aim is really to prevent multiple enumerations than the answer by Marc Gravell is the one to read, but maintaining the same semantics you could simple remove the redundant Any and First calls and go with:

public List<object> Foo(IEnumerable<object> objects)
{
    if (objects == null)
        throw new ArgumentNullException("objects");

    var first = objects.FirstOrDefault();

    if (first == null)
        throw new ArgumentException(
            "Empty enumerable not supported.", 
            "objects");

    var list = DoSomeThing(first);  

    var secondList = DoSomeThingElse(objects);

    list.AddRange(secondList);

    return list;
}

Note, that this assumes that you IEnumerable is not generic or at least is constrained to be a reference type.



回答4:

I usually overload my method with IEnumerable and IList in this situation.

public static IEnumerable<T> Method<T>( this IList<T> source ){... }

public static IEnumerable<T> Method<T>( this IEnumerable<T> source )
{
    /*input checks on source parameter here*/
    return Method( source.ToList() );
}

I take care to explain in the summary comments of the methods that calling IEnumerable will perform a .ToList().

The programmer can choose to .ToList() at a higher level if multiple operations are being concatenated and then call the IList overload or let my IEnumerable overload take care of that.



回答5:

Using IReadOnlyCollection<T> or IReadOnlyList<T> in the method signature instead of IEnumerable<T>, has the advantage of making explicit that you might need to check the count before iterating, or to iterate multiple times for some other reason.

However they have a huge downside that will cause problems if you try to refactor your code to use interfaces, for instance to make it more testable and friendly to dynamic proxying. The key point is that IList<T> does not inherit from IReadOnlyList<T>, and similarly for other collections and their respective read-only interfaces. (In short, this is because .NET 4.5 wanted to keep ABI compatibility with earlier versions. But they didn't even take the opportunity to change that in .NET core.)

This means that if you get an IList<T> from some part of the program and want to pass it to another part that expects an IReadOnlyList<T>, you can't! You can however pass an IList<T> as an IEnumerable<T>.

In the end, IEnumerable<T> is the only read-only interface supported by all .NET collections including all collection interfaces. Any other alternative will come back to bite you as you realize that you locked yourself out from some architecture choices. So I think it's the proper type to use in function signatures to express that you just want a read-only collection.

(Note that you can always write a IReadOnlyList<T> ToReadOnly<T>(this IList<T> list) extension method that simple casts if the underlying type supports both interfaces, but you have to add it manually everywhere when refactoring, where as IEnumerable<T> is always compatible.)

As always this is not an absolute, if you're writing database-heavy code where accidental multiple enumeration would be a disaster, you might prefer a different trade-off.



回答6:

If you only need to check the first element you can peek on it without iterating the whole collection:

public List<object> Foo(IEnumerable<object> objects)
{
    object firstObject;
    if (objects == null || !TryPeek(ref objects, out firstObject))
        throw new ArgumentException();

    var list = DoSomeThing(firstObject);
    var secondList = DoSomeThingElse(objects);
    list.AddRange(secondList);

    return list;
}

public static bool TryPeek<T>(ref IEnumerable<T> source, out T first)
{
    if (source == null)
        throw new ArgumentNullException(nameof(source));

    IEnumerator<T> enumerator = source.GetEnumerator();
    if (!enumerator.MoveNext())
    {
        first = default(T);
        source = Enumerable.Empty<T>();
        return false;
    }

    first = enumerator.Current;
    T firstElement = first;
    source = Iterate();
    return true;

    IEnumerable<T> Iterate()
    {
        yield return firstElement;
        using (enumerator)
        {
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
            }
        }
    }
}


回答7:

First off, that warning does not always mean so much. I usually disabled it after making sure it's not a performance bottle neck. It just means the IEnumerable is evaluated twice, wich is usually not a problem unless the evaluation itself takes a long time. Even if it does take a long time, in this case your only using one element the first time around.

In this scenario you could also exploit the powerful linq extension methods even more.

var firstObject = objects.First();
return DoSomeThing(firstObject).Concat(DoSomeThingElse(objects).ToList();

It is possible to only evaluate the IEnumerable once in this case with some hassle, but profile first and see if it's really a problem.



标签: