Explanation why IEnumerable is more efficient than

2020-05-12 12:34发布

问题:

I keep hearing that in .net 3.5 you should use IEnumerable over a List, but I can’t find any reference materials or articles that explain why it’s so much more proficient. Does anyone know of any content that explains this?

The purpose of asking this question is to get a better understanding of what IEnumerable is doing under the hood. If you can provide me with any links I will do the research and post an answer.

回答1:

IEnumerable<T> is an interface that is implemented by List<T>. I suspect the reason you're hearing that IEnumerable<T> should be used is because it's a less constrictive interface requirement.

For example, consider the following method signature:

void Output(List<Foo> foos) 
{ 
    foreach(var foo in foos) { /* do something */ }
}

This method requires that it be passed a concrete implementation of a List. But it's just doing something in-order. It doesn't really need random access or any of the other things that a List<T> or even an IList<T> give it. Instead, the method should accept an IEnumerable<T>:

void Output(IEnumerable<Foo> foos) 
{ 
    foreach(var foo in foos) { /* do something */ }
}

Now we're using the most general (least specific) interface that supports the operations that we need. This is a fundamental aspect of OO-design. We've decreased coupling by requiring only what we need, and not a whole lot else besides. We've also created a more flexible method because the foos parameter might be a Queue<T>, a List<T>, anything that implements IEnumerable<T>. We aren't forcing the caller to convert their data structure to a List unnecessarily.

So it isn't that IEnumerable<T> is more efficient than list in a "performance" or "runtime" aspect. It's that IEnumerable<T> is a more efficient design construct because it's a more specific indication of what your design requires. (Though this can lead to runtime gains in specific cases.)



回答2:

Enumerables have several very nice properties that you lose when converting them to a list. Namely they:

  • Use Deferred/lazy Execution
  • Are Composable
  • Are Unbounded

First I'll look at deferred execution. Pop quiz: how many times will the following code iterate over the lines in the input file?

IEnumerable<string> ReadLines(string fileName)
{
    using (var rdr = new StreamReader(fileName) )
    {
       string line;
       while ( (line = rdr.ReadLine()) != null) yield return line;
    }
}


var SearchIDs = new int[] {1234,4321, 9802};

var lines = ReadLines("SomeFile.txt")
              .Where(l => l.Length > 10 && l.StartsWith("ID: "));
              .Select(l => int.Parse(l.Substring(4).Trim()));
              .Intersect(SearchIDs);

The answer is exactly one zero. It doesn't actually do any work until you iterate over the results. You need to add this code before it even opens the file:

foreach (string line in lines) Console.WriteLine(line);

Even after the code runs, it still only ever loops over the lines once. Compare that to how many times you need to iterate over the lines in this code:

var SearchIDs = new int[] {1234,4321, 9802};
var lines = File.ReadAllLines("SomeFile.txt"); //creates a list
lines = lines.Where(l => l.Length > 10 && l.StartsWith("ID: ")).ToList();
var ids = lines.Select(l => int.Parse(l.Substring(4).Trim())).ToList();
ids = ids.Intersect(SearchIDs).ToList();

foreach (string line in lines) Console.WriteLine(line);

Even if you ignore the File.ReadAllLines() call and use the same iterator block from the first sample, the first sample would still be faster. Of course, you could write it to be just as fast using lists, but to do requires tying the code that reads the file to the code the code that parses it. And so you lose another important feature: composability.

To demonstrate composability, I'll add one the final feature — unbounded series. Consider the folllowing:

IEnumerable<int> Fibonacci()
{
   int n1 = 1, n2 = 0, n;
   yield return 1;
   while (true)
   {
        n = n1 + n2;
        yield return n;
        n2 = n1;
        n1 = n;
   }
}

This looks like it would go forever, but you can use the composability properties of IEnumerable to build something that safely gives, say, the first 50 values, or every value that's less than a given number:

  foreach (int f in Fibonacci().Take(50)) { /* ... */ }
  foreach (int f in Fibonacci().TakeWhile(i => i < 1000000) { /* ... */ }

Finally, IEnumerable is just more flexible. Unless you absolutely need the ability to append to the list or access items by index, you're almost always better writing functions to accept IEnumerables as arguments instead of Lists. Why? Because you can still pass the list to the function if you want — A list is an IEnumerable. For that matter, so is an array and many other collection types are well. So by using an IEnumerable here, you take the exact same function and make it more powerful, because it can act on more different kinds of data.



回答3:

IEnumerable<T> is not more efficient than a List<T> as a List<T> is an IEnumerable<T>.

The IEnumerable<T> interface is simply .NET's way of using the iterator pattern, nothing more.

This interface can be implemented on many types (List<T> included) to allow those types to to return iterators (i.e. instances of IEnumerator<T>) so that the caller can iterate a sequence of items.



回答4:

It isn't a question of efficiency (although that may be true) but of flexibility.

Your code becomes more re-usable if it can consume an IEnumerable instead of a List. AS to efficient consider this code:-

 function IEnumerable<int> GetDigits()
 {

    for(int i = 0; i < 10; i++)
       yield return i
 }

 function int Sum(List<int> numbers)
 {
    int result = 0; 
    foreach(int i in numbers)
      result += i;

    return i;
 }

Q: How do I take the set of numbers generated by GetDigits and get Sum to add them up?
A: I need to load the set of numbers from GetDigits into a List object and pass that to the Sum function. This uses memory as all the digits need to be loaded into memory first before they can be summed. However changing the signature of Sum to:-

 function int Sum(IEnumerable<int> numbers)

Means I can do this:-

 int sumOfDigits = Sum(GetDigits());

No list is loaded into memory I only need storage for the current digit and the accumulator variable in sum.



回答5:

These are two different beasts and you cannot really compare them. For instance, in var q = from x in ... the q is IEnumerable, but under the hood it executes a very expensive database call.

An IEnumerable is just an interface for an Iterator design pattern, whereas List/IList is a data container.



回答6:

One reason that it is recommended to have methods return IEnumerable<T> is that it is less specific than List<T>. This means that you can later change the internals of your method to use something that may be more efficient for the needs of it, and as long as it is an IEnumerable<T> you don't need to touch the contract of your method.



回答7:

In .NET 3.5, using IEnumerable allows you to write methods with deferred execution, such as the following:

public class MyClass
{
   private List<int> _listOne;
   private List<int> _listTwo;
public IEnumerable<int> GetItems () { foreach (int n in _listOne) { yield return n; } foreach (int n in _listTwo) { yield return n; } } }

This allows you to combine the two lists without creating a new List<int> object.