Performance impact of changing to generic interfaces

Published 2019-04-28 11:28

戒情不戒烟

Question:

I work on applications developed in C#/.NET with Visual Studio. Very often ReSharper, in the prototypes of my methods, advises me to replace the type of my input parameters with more generic ones. For instance, List<> with IEnumerable<> if I only use the list with a foreach in the body of my method. I can understand why it looks smarter to write that but I'm quite concerned with the performance. I fear that the performance of my apps will decrease if I listen to ReSharper...

Can someone explain to me precisely (more or less) what's happening behind the scenes (i.e. in the CLR) when I write:

public static void myMethod(IEnumerable<string> list)
{
  foreach (string s in list)
  {
    Console.WriteLine(s);
  }
}

static void Main()
{
  List<string> list = new List<string>(new string[] {"a", "b", "c"});
  myMethod(list);
}

and what is the difference with:

public static void myMethod(List<string> list)
{
  foreach (string s in list)
  {
    Console.WriteLine(s);
  }
}

static void Main()
{
  List<string> list = new List<string>(new string[] {"a", "b", "c"});
  myMethod(list);
}

Answer 1:

You're worried about performance - but do you have any grounds for that concern? My guess is that you haven't benchmarked the code at all. Always benchmark before replacing readable, clean code with more performant code.

In this case the call to Console.WriteLine will utterly dominate the performance anyway.

While I suspect there may be a theoretical difference in performance between using List<T> and IEnumerable<T> here, I suspect the number of cases where it's significant in real world apps is vanishingly small.

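It's not even as if the sequence type is being used for many operations - there's a single call to GetEnumerator(), followed by MoveNext() and Current on whatever it returns. With a List<T> parameter the compiler binds to List<T>.GetEnumerator(), which returns the List<T>.Enumerator struct; through IEnumerable<T> the call is declared to return IEnumerator<T>, so the enumerator is boxed and its members are called through the interface. That amounts to one small allocation at the start of the loop plus an interface call per element, which is rarely measurable next to real work in the loop body.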

Ignoring the analysis though, the thing to take out of this is to measure performance before you base coding decisions on it.
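A minimal sketch of such a measurement, assuming a simple Stopwatch loop is good enough for a first look (the class and method names are made up for illustration, and a proper benchmarking harness would give more trustworthy numbers):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class EnumerationTiming
{
    // Sums string lengths so the loop body is cheap and iteration cost is visible.
    public static long SumViaList(List<string> list)
    {
        long total = 0;
        foreach (string s in list) total += s.Length;
        return total;
    }

    // Same body, but the parameter is typed as the interface.
    public static long SumViaInterface(IEnumerable<string> list)
    {
        long total = 0;
        foreach (string s in list) total += s.Length;
        return total;
    }

    // Runs the action 'iterations' times and returns the elapsed milliseconds.
    public static long Time(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not counted
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        List<string> list = new List<string>();
        for (int i = 0; i < 100000; i++) list.Add("item" + i);

        long viaList = Time(() => SumViaList(list), 200);
        long viaInterface = Time(() => SumViaInterface(list), 200);

        Console.WriteLine("List<string>:        " + viaList + " ms");
        Console.WriteLine("IEnumerable<string>: " + viaInterface + " ms");
    }
}
```

Run it in a Release build, outside the debugger, and several times over; the numbers vary by machine and runtime version.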

As for what happens behind the scenes - you'd have to dig into the deep details of exactly what's in the metadata in each case. I suspect that in the case of an interface there's one extra level of redirection, at least in theory - the CLR would have to work out where in the target object's type the vtable for IEnumerable<T> was, and then call into the appropriate method's code. In the case of List<T>, the JIT would know the right offset into the vtable to start with, without the extra lookup. This is just based on my somewhat hazy understanding of JITting, thunking, vtables and how they apply to interfaces. It may well be slightly wrong, but more importantly it's an implementation detail.

Answer 2:

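You'd have to look at the generated code to be certain, but in this case, I doubt there's much difference. The foreach statement only needs a GetEnumerator() method on the static type of the expression. With an IEnumerable<T> parameter it calls the interface method; with List<T> it calls the public List<T>.GetEnumerator(), which returns a struct enumerator, so the List<T> version avoids one small allocation. Either way the loop then just calls MoveNext() and Current.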
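One way to check which GetEnumerator() the compiler binds to is to capture the static type of the call's result. A small sketch, where EnumeratorTypes and StaticTypeOf are hypothetical helper names introduced only for this illustration:

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorTypes
{
    // Returns the compile-time type of the argument, not its runtime type.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        List<string> list = new List<string> { "a", "b", "c" };

        // Called on the List<string> static type: the declared result is a struct.
        Type viaList = StaticTypeOf(list.GetEnumerator());

        // Called through the interface: the declared result is IEnumerator<string>.
        Type viaInterface = StaticTypeOf(((IEnumerable<string>)list).GetEnumerator());

        Console.WriteLine(viaList.IsValueType);      // True
        Console.WriteLine(viaInterface.IsValueType); // False
    }
}
```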

Answer 3:

In general, I'd say that if you replace a non-generic interface with its generic equivalent (say IList --> IList<T>) you are bound to get better or equivalent performance.

One selling point is that, unlike Java, .NET does not use type erasure and supports true value types (struct), so one of the main differences is in how a List<int> stores its elements internally: as unboxed ints, where the non-generic ArrayList has to box each one. This could quite quickly become a big difference depending on how intensively the List is being used.
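The storage difference can be sketched as follows (BoxingSketch and its methods are illustrative names, not part of any library):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingSketch
{
    // ArrayList.Add takes object, so each int is boxed going in and cast coming out.
    public static int SumArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 1; i <= count; i++) list.Add(i);   // boxes i
        int total = 0;
        foreach (object o in list) total += (int)o;     // unboxes
        return total;
    }

    // List<int> keeps the ints in an int[] internally: no boxing and no casts.
    public static int SumGenericList(int count)
    {
        List<int> list = new List<int>();
        for (int i = 1; i <= count; i++) list.Add(i);
        int total = 0;
        foreach (int value in list) total += value;
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumArrayList(100));    // 5050
        Console.WriteLine(SumGenericList(100));  // 5050
    }
}
```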

A braindead synthetic benchmark showed:

    for (int j=0; j<1000; j++)
    {
        List<int> list = new List<int>();
        for (int i = 1<<12; i>0; i--)
            list.Add(i);

        list.Sort();
    }

to be faster by a factor of 3.2x than the semi-equivalent non-generic:

    for (int j=0; j<1000; j++)
    {
        ArrayList list = new ArrayList();
        for (int i = 1<<12; i>0; i--)
            list.Add(i);

        list.Sort();
    }

Disclaimer: I realize this benchmark is synthetic, and it doesn't actually focus on the use of interfaces (rather, it directly dispatches virtual method calls on a specific type). However, it illustrates the point I'm making: don't fear generics (at least not for performance reasons).

Answer 4:

In general, the increased flexibility will be worth whatever minor performance difference it incurs.

Answer 5:

The first version (IEnumerable<string>) is more general: it says the method accepts any argument that implements this interface.

The second version restricts the method to one specific class type, which is not recommended when the method does not need it. And the performance is mostly the same.

Answer 6:

The basic reason for this recommendation is future flexibility: a method that works on IEnumerable<T> rather than List<T> accepts more kinds of input. If in the future you need to create a MySpecialStringsCollection, you could have it implement the IEnumerable<string> interface and still use the same method.

Essentially, I think it comes down to this: unless you're noticing a significant, meaningful performance hit (and I'd be shocked if you noticed any), prefer a more tolerant interface that will accept more than what you're expecting today.
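As a rough sketch of that flexibility, here is a hypothetical MySpecialStringsCollection (its array-backed implementation is an assumption for the example) passed to the same IEnumerable<string> method as a List<string> and an array:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection; it only has to implement IEnumerable<string>.
public class MySpecialStringsCollection : IEnumerable<string>
{
    private readonly string[] items;

    public MySpecialStringsCollection(params string[] items)
    {
        this.items = items;
    }

    public IEnumerator<string> GetEnumerator()
    {
        foreach (string s in items) yield return s;
    }

    // The non-generic interface member simply forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class FlexibilityDemo
{
    // Accepts any sequence of strings and joins them with commas.
    public static string Describe(IEnumerable<string> values)
    {
        return string.Join(",", values);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new List<string> { "a", "b", "c" }));
        Console.WriteLine(Describe(new string[] { "a", "b", "c" }));
        Console.WriteLine(Describe(new MySpecialStringsCollection("a", "b", "c")));
    }
}
```

All three calls compile against the one signature; a List<string> parameter would have accepted only the first.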

Answer 7:

The definition for List<T> is:

[SerializableAttribute]
public class List<T> : IList<T>, ICollection<T>, 
    IEnumerable<T>, IList, ICollection, IEnumerable

So List<T> implements IList, ICollection, IList<T>, and ICollection<T>, in addition to IEnumerable and IEnumerable<T>.

The IEnumerable interface exposes the GetEnumerator method, which returns an IEnumerator; the IEnumerator in turn exposes a MoveNext method and a Current property. These are the mechanisms that foreach uses to iterate through the list.

It follows that, if IList, ICollection, IList<T>, and ICollection<T> are not required to do the job, then it's sensible to use IEnumerable or IEnumerable<T> instead, thereby eliminating the additional plumbing.
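Written out by hand, the iteration that foreach performs over an IEnumerable<string> looks roughly like this (a sketch; ForeachExpansion and CopyManually are illustrative names, and the compiler's actual expansion differs in small details):

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Approximately what 'foreach (string s in values) result.Add(s);' does.
    public static List<string> CopyManually(IEnumerable<string> values)
    {
        List<string> result = new List<string>();
        using (IEnumerator<string> e = values.GetEnumerator())
        {
            while (e.MoveNext())        // advance; false once the sequence ends
            {
                string s = e.Current;   // read the current element
                result.Add(s);
            }
        }
        return result;
    }

    public static void Main()
    {
        List<string> list = new List<string> { "a", "b", "c" };
        foreach (string s in CopyManually(list)) Console.WriteLine(s);
    }
}
```

Nothing in the loop uses IList or ICollection members, which is why IEnumerable<T> is enough for a method that only iterates.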

Answer 8:

An interface simply defines the presence and signature of public methods and properties implemented by the class. Since the interface does not "stand on its own", there should be no performance difference for the method itself, and any "casting" penalty should be almost too small to measure.

Answer 9:

There is no performance penalty for a static-upcast. It's a logical construct in program text.

As other people have said, premature optimization is the root of all evil. Write your code, then run it through a hotspot analysis before you worry about performance tuning.

Answer 10:

Accepting IEnumerable<> might create some trouble, as you could receive a LINQ expression with deferred execution, or an iterator built with yield return. In both cases you won't have a collection, only something you can iterate over. So when you would like to set some boundaries, you could request an array. It is not a problem for the caller to call collection.ToArray() before passing the parameter, and you'll be sure there are no hidden deferred-execution caveats.
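A small sketch of the caveat, using a made-up iterator method with a counter (DeferredDemo, Produce and ProducerRuns are names invented for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the iterator body has started running.
    public static int ProducerRuns = 0;

    // An iterator method: the body does not run until the result is enumerated.
    public static IEnumerable<int> Produce()
    {
        ProducerRuns++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        IEnumerable<int> lazy = Produce();
        Console.WriteLine(ProducerRuns);   // 0: nothing has executed yet

        lazy.Sum();
        lazy.Sum();
        Console.WriteLine(ProducerRuns);   // 2: each enumeration re-runs the body

        int[] materialized = Produce().ToArray();
        materialized.Sum();
        materialized.Sum();
        Console.WriteLine(ProducerRuns);   // 3: the array is enumerated without re-running
    }
}
```

If the producer were a database query or a file read instead of a counter, each extra enumeration would repeat that work, which is the boundary an array parameter makes explicit.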