Question:
I ran into a code design question. Say I have a "template" method that invokes some functions that may vary. An intuitive design is to follow the Template Method pattern: define the varying functions as virtual functions to be overridden in subclasses. Alternatively, I can use delegates without "virtual". The delegates are injected, so they can be customized too.
Originally, I thought the second, "delegate" way would be faster than the "virtual" way, but a code snippet shows that this is not the case.
In the code below, the first DoSomething method follows the template pattern and calls the virtual method IsTokenChar. The second DoSomething method doesn't depend on a virtual function. Instead, it takes a delegate as a parameter. On my computer, the first DoSomething is always faster than the second. The result is something like 1645:1780 (elapsed milliseconds).
A virtual invocation is dynamic binding and should be more costly than a direct delegate invocation, right? But the result shows it is not.
Can anybody explain this?
using System;
using System.Diagnostics;

class Foo
{
    public virtual bool IsTokenChar(string word)
    {
        return String.IsNullOrEmpty(word);
    }

    // this is a template method
    public int DoSomething(string word)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (IsTokenChar(word))
            {
                ++trueCount;
            }
        }
        return trueCount;
    }

    public int DoSomething(Predicate<string> predicator, string word)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (predicator(word))
            {
                ++trueCount;
            }
        }
        return trueCount;
    }

    private int repeat = 200000000;
}

class Program
{
    static void Main(string[] args)
    {
        Foo f = new Foo();
        {
            Stopwatch sw = Stopwatch.StartNew();
            f.DoSomething(null);
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
        {
            Stopwatch sw = Stopwatch.StartNew();
            f.DoSomething(str => String.IsNullOrEmpty(str), null);
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
    }
}
Answer 1:
Think about what's required in each case:
Virtual call
- Check for nullity
- Navigate from object pointer to type pointer
- Look up method address in instruction table
- (Not sure - even Richter doesn't cover this) Go to base type if method isn't overridden? Recurse until we find the right method address. (I don't think so - see edit at bottom.)
- Push original object pointer onto stack ("this")
- Call method
Delegate call
- Check for nullity
- Navigate from object pointer to array of invocations (all delegates are potentially multicast)
- Loop over array, and for each invocation:
  - Fetch method address
  - Work out whether or not to pass the target as first argument
  - Push arguments onto stack (may have been done already - not sure)
  - Optionally (depending on whether the invocation is open or closed) push the invocation target onto the stack
  - Call method
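The multicast and target points in the list above can be observed from C# itself. The following is a minimal sketch, not a description of the CLR's internal layout: it only uses the public Delegate.GetInvocationList and Delegate.Target members, and the names DelegateAnatomy, IsEmpty, IsLong and CountInvocations are made up for illustration.

```csharp
using System;

static class DelegateAnatomy
{
    // Two hypothetical predicates used only for this illustration.
    public static bool IsEmpty(string s) => string.IsNullOrEmpty(s);
    public static bool IsLong(string s) => s != null && s.Length > 10;

    // Every delegate exposes its invocation list, even when it wraps one method.
    public static int CountInvocations(Delegate d) => d.GetInvocationList().Length;

    static void Main()
    {
        Predicate<string> single = IsEmpty;
        Predicate<string> combined = single;
        combined += IsLong; // now a multicast delegate with two entries

        Console.WriteLine(CountInvocations(single));   // 1
        Console.WriteLine(CountInvocations(combined)); // 2

        // A delegate created from a static method group has no target object.
        Console.WriteLine(single.Target == null);      // True

        // A multicast call runs every entry and returns the last result,
        // here IsLong(null).
        Console.WriteLine(combined(null));             // False
    }
}
```

The single-entry case is the one that matters for the benchmark in the question, since the predicate passed to DoSomething wraps exactly one method.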
There may be some optimisation so that there's no looping involved in the single-call case, but even so that will take a very quick check.
But basically there's just as much indirection involved with a delegate. Given the bit I'm unsure of in the virtual method call, it's possible that a call to an unoverridden virtual method in a massively deep type hierarchy would be slower... I'll give it a try and edit with the answer.
EDIT: I've tried playing around with the depth of the inheritance hierarchy (up to 20 levels), the point of "most derived overriding" and the declared variable type, and none of them seems to make a difference.
EDIT: I've just tried the original program using an interface (which is passed in) - that ends up having about the same performance as the delegate.
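The exact code used for that interface test is not shown, so the following is only a sketch of what such a variant might look like. The names ITokenChecker, NullOrEmptyChecker and InterfaceVariant are assumptions, and the repeat count is passed in as a parameter instead of being a field as in the question.

```csharp
using System;
using System.Diagnostics;

// Hypothetical interface standing in for the injected behaviour.
interface ITokenChecker
{
    bool IsTokenChar(string word);
}

class NullOrEmptyChecker : ITokenChecker
{
    public bool IsTokenChar(string word) => string.IsNullOrEmpty(word);
}

static class InterfaceVariant
{
    // Same loop as in the question, but the check goes through an interface call.
    public static int DoSomething(ITokenChecker checker, string word, int repeat)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (checker.IsTokenChar(word))
            {
                ++trueCount;
            }
        }
        return trueCount;
    }

    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        int count = DoSomething(new NullOrEmptyChecker(), null, 200000000);
        sw.Stop();
        Console.WriteLine(count + " matches in " + sw.ElapsedMilliseconds + " ms");
    }
}
```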
Answer 2:
Just wanted to add a few corrections to Jon Skeet's response:
A virtual method call does not need to do a null check (automatically handled with hardware traps).
It also does not need to walk up the inheritance chain to find non-overridden methods (that's what the virtual method table is for).
A virtual method call is essentially one extra level of indirection when invoking. It is slower than a normal call because of the table look-up and subsequent function pointer call.
A delegate call also involves an extra level of indirection.
Calls to a delegate do not involve putting arguments in an array unless you are performing a dynamic invoke using the DynamicInvoke method.
A delegate call involves the calling method calling a compiler generated Invoke method on the delegate type in question. A call to predicator(value) is turned into predicator.Invoke(value).
The Invoke method in turn is implemented by the JIT to call the function pointer(s) (stored internally in the delegate object).
In your example, the delegate you passed should have been implemented as a compiler-generated static method, because the implementation does not access any instance variables or locals. The need to access the "this" pointer from the heap should therefore not be an issue.
The performance of delegate and virtual function calls should be mostly the same, and your performance tests show that they are very close.
The difference could be due to the need for additional checks and branches because of multicast (as suggested by Jon). Another reason could be that the JIT compiler does not inline the Delegate.Invoke method, and that the implementation of Delegate.Invoke does not handle arguments as well as the implementation used for virtual method calls.
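The distinction this answer draws between the call syntax, the generated Invoke method and DynamicInvoke can be sketched as follows. The wrapper names (InvokeForms, CallWithSugar, CallWithInvoke, CallDynamically) are made up for illustration. Only Invoke and DynamicInvoke are real members of the delegate type.

```csharp
using System;

static class InvokeForms
{
    // The call syntax p(word) is shorthand for p.Invoke(word).
    public static bool CallWithSugar(Predicate<string> p, string word) => p(word);

    public static bool CallWithInvoke(Predicate<string> p, string word) => p.Invoke(word);

    // DynamicInvoke is the late-bound path: arguments travel in an object[]
    // and the result comes back boxed.
    public static bool CallDynamically(Delegate d, string word) =>
        (bool)d.DynamicInvoke(new object[] { word });

    static void Main()
    {
        Predicate<string> predicator = string.IsNullOrEmpty;

        Console.WriteLine(CallWithSugar(predicator, ""));    // True
        Console.WriteLine(CallWithInvoke(predicator, ""));   // True
        Console.WriteLine(CallDynamically(predicator, ""));  // True
    }
}
```

All three forms give the same answer. Only the last one packs the arguments into an array, which is why it is not what the benchmark in the question exercises.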
Answer 3:
A virtual call dereferences two pointers at well-known offsets in memory. It's not actually dynamic binding; there is no code at runtime that reflects over the metadata to discover the right method. The compiler generates a couple of instructions to do the call, based on the this pointer. In fact, the virtual call is a single IL instruction.
A predicate call creates an anonymous class to encapsulate the predicate. That class has to be instantiated, and some code is generated to check whether the predicate function pointer is null.
I would suggest you look at the IL constructs for both. Compile a simplified version of your source above with a single call to each of the two DoSomething methods. Then use ILDASM to see the actual code for each pattern.
(And I am sure I'll get downvoted for not using the right terminology :-))
Answer 4:
A test result is worth a thousand words: http://kennethxu.blogspot.com/2009/05/strong-typed-high-performance_15.html
Answer 5:
Since you don't have any methods that override the virtual method, it is possible that the JIT is able to recognize this and use a direct call instead.
For something like this it's generally better to test it out as you have done than try to guess what the performance will be. If you want to know more about how delegate invocation works, I suggest the excellent book "CLR Via C#" by Jeffrey Richter.
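When measuring this kind of thing, it can help to keep JIT compilation out of the timed region and to repeat the measurement. The following is a rough sketch under those assumptions, not a rigorous benchmark harness. Bench, Measure and CountMatches are hypothetical names, and the warm-up and run counts are arbitrary.

```csharp
using System;
using System.Diagnostics;

static class Bench
{
    // Hypothetical helper: runs the action untimed first so it is already
    // JIT-compiled, then returns the fastest timed run in milliseconds.
    public static long Measure(Action action, int warmupRuns, int timedRuns)
    {
        for (int i = 0; i < warmupRuns; ++i)
        {
            action();
        }

        long best = long.MaxValue;
        for (int i = 0; i < timedRuns; ++i)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            best = Math.Min(best, sw.ElapsedMilliseconds);
        }
        return best;
    }

    // The delegate-based loop from the question, with repeat as a parameter.
    public static int CountMatches(Predicate<string> predicate, string word, int repeat)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (predicate(word))
            {
                ++trueCount;
            }
        }
        return trueCount;
    }

    static void Main()
    {
        const int repeat = 200000000;
        Predicate<string> predicate = string.IsNullOrEmpty;

        long ms = Measure(() => CountMatches(predicate, null, repeat), 1, 3);
        Console.WriteLine("delegate: " + ms + " ms");
    }
}
```

The same Measure call can wrap the virtual-method version, so both paths are timed under the same conditions.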
Answer 6:
I doubt it accounts for all of your difference, but one thing off the top of my head that may account for some of the difference is that virtual method dispatch already has the this pointer ready to go. When calling through a delegate the this pointer has to be fetched from the delegate.
Note that according to this blog article the difference was even greater in .NET v1.x.
Answer 7:
Virtual overrides use some sort of redirection table that is hardcoded and fully optimized at compile time. It's set in stone and very fast.
Delegates are dynamic, which always carries an overhead, and they seem to be objects too, so that adds up.
You shouldn't worry about these small performance differences (unless developing performance critical software for the military), for most purposes good code structure wins over optimization.