I seem to recall reading somewhere that the cost of a virtual call in C# is not as high, relatively speaking, as in C++. Is this true? If so - why?
The original question says:

> the cost of a virtual call in C# is *not as high, relatively speaking,* as in C++

Note the emphasis. In other words, the question might be rephrased as:

> Is the cost of a virtual call, *relative to the cost of a non-virtual call*, lower in C# than in C++?

So the questioner is not claiming that C# is faster than C++ under any circumstances.
Possibly a useless diversion, but this sparked my curiosity about C++ compiled with /clr:pure, using no C++/CLI extensions. The compiler produces IL that gets converted to native code by the JIT, even though it is pure C++. So here we have a way of seeing what a standard C++ implementation does when running on the same platform as C#.
With a non-virtual method:
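A minimal sketch (the class and method names here are illustrative, not from the original answer):

```cpp
class Bar
{
public:
    void NonVirtualMethod() { }
};
```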
This code:
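(Continuing the sketch:)

```cpp
int main()
{
    Bar bar;
    bar.NonVirtualMethod();
}
```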
... causes the `call` opcode to be emitted with the specific method name, passing `Bar` an implicit `this` argument.

Compare with an inheritance hierarchy:
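(Again an illustrative sketch:)

```cpp
class Base
{
public:
    virtual void VirtualMethod() { }
};

class Derived : public Base
{
public:
    void VirtualMethod() override { }
};
```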
Now if we do:
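(Calling through a base pointer:)

```cpp
int main()
{
    Base* b = new Derived();
    b->VirtualMethod();
}
```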
That emits the `calli` opcode instead, which jumps to a computed address, so there's a lot of IL before the call. By turning it back into C# we can see what is going on:
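The decompiled code would look roughly like the following sketch; the delegate and the helper are hypothetical stand-ins for what the IL does inline:

```csharp
using System;
using System.Runtime.InteropServices;

delegate void Slot0(IntPtr self);   // signature of the first vtable slot

static class VtableSketch
{
    // 'obj' stands for the address of the C++ object (hypothetical).
    static void CallFirstVirtual(IntPtr obj)
    {
        IntPtr vtable = Marshal.ReadIntPtr(obj);    // first word of the object: the vtable's address
        IntPtr target = Marshal.ReadIntPtr(vtable); // first vtable slot: the method's address
        var fn = Marshal.GetDelegateForFunctionPointer<Slot0>(target);
        fn(obj);                                    // call it, passing the implicit 'this'
    }
}
```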
In other words: cast the address of `b` to a pointer to int (which happens to be the same size as a pointer) and take the value at that location, which is the address of the vtable; then take the first item in the vtable, which is the address to jump to; dereference it and call it, passing it the implicit `this` argument.

We can tweak the virtual example to use C++/CLI extensions:
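(The same hierarchy sketched as ref classes, compiled with /clr:)

```cpp
ref class Base
{
public:
    virtual void VirtualMethod() { }
};

ref class Derived : public Base
{
public:
    virtual void VirtualMethod() override { }
};

int main()
{
    Base^ b = gcnew Derived();
    b->VirtualMethod();
}
```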
This generates the `callvirt` opcode, exactly as it would in C#.

So when compiling to target the CLR, Microsoft's current C++ compiler doesn't have the same possibilities for optimization as C# does when using the standard features of each language: for a standard C++ class hierarchy, the C++ compiler generates code that contains hard-coded logic for traversing the vtable, whereas for a ref class it leaves it to the JIT to figure out the optimal implementation.
It may not be exactly the answer to your question, but although the .NET JIT optimizes virtual calls as everyone said above, profile-guided optimization in Visual Studio 2005 and 2008 does virtual call speculation: it inserts a direct call to the most likely target function and can inline that call, so the cost may end up the same.
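Conceptually, the speculation amounts to something like the sketch below; the types and the `typeid` guard are illustrative (real implementations compare the vptr directly rather than calling `typeid`):

```cpp
#include <typeinfo>

class Base
{
public:
    virtual void VirtualMethod() { }
};

class Derived : public Base
{
public:
    void VirtualMethod() override { }
};

void CallSpeculatively(Base* b)
{
    if (typeid(*b) == typeid(Derived))                       // guard: is this the profiled hot type?
        static_cast<Derived*>(b)->Derived::VirtualMethod();  // direct, inlinable call
    else
        b->VirtualMethod();                                  // fall back to normal virtual dispatch
}
```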
A C# virtual call has to check for “this” being null and a C++ virtual call does not, so I can't see, in general, why a C# virtual call would be faster. In special cases the C# compiler (or JIT compiler) may be able to inline the virtual call better than a C++ compiler, as a C# compiler has access to better type information. The “call method” instruction may sometimes be slower in C++, as the C# JIT may be able to use a quicker instruction that only copes with a small offset, since it knows more about the runtime memory layout and processor model than a C++ compiler does.
However, we are talking about a handful of processor instructions at most here. On a modern superscalar processor, it is very possible that the “null check” instruction runs at the same time as the “call method” instruction and therefore takes no time.
It is also very likely that all the processor instructions will already be in the level-1 cache if the call is made in a loop. But the data is less likely to be cached; the cost of reading a data value from main memory these days is the same as running hundreds of instructions from the level-1 cache. Therefore it is unlikely that in real applications the cost of a virtual call is even measurable in more than a very few places.
The fact that the C# code uses a few more instructions will of course reduce the amount of code that can fit in the cache; the effect of this is impossible to predict.
(If the C++ class uses multiple inheritance then the cost is higher, due to having to patch up the “this” pointer. Likewise, interfaces in C# add another level of indirection.)
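A minimal illustration of the multiple-inheritance “this” adjustment (names are illustrative):

```cpp
struct A { virtual void fa() { } int a; };
struct B { virtual void fb() { } int b; };
struct C : A, B { void fb() override { } };

int main()
{
    C c;
    B* pb = &c;   // pb points at the B subobject, not at the start of c
    pb->fb();     // dispatch goes through a thunk that adjusts 'this' back to C*
}
```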
Not sure about the full framework, but in the Compact Framework it will be slower, because the CF has no virtual call tables, although it does cache the result. This means that a virtual call in the CF will be slower the first time it is called, as it has to do a manual lookup. It may be slow every time it is called if the app is low on memory, as the cached lookup may be discarded.
In C# it might be possible to convert a virtual call into a non-virtual one by analysing the code. In practice it won't happen often enough to make much difference.
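For example, when the receiver's exact type is provable (say, a sealed class), the virtual call could be bound directly; a hedged sketch:

```csharp
class Animal
{
    public virtual string Speak() => "...";
}

sealed class Dog : Animal
{
    public override string Speak() => "Woof";
}

class Demo
{
    static void Main()
    {
        Animal a = new Dog();                 // exact type is provable here, and Dog is sealed,
        System.Console.WriteLine(a.Speak());  // so the call could be devirtualized
    }
}
```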
C# flattens the vtable (each type's method table contains a slot for every virtual method, inherited or not) and inlines ancestor slots, so a call never has to chain up the inheritance hierarchy to resolve anything.
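A runnable illustration of the observable effect: hierarchy depth adds no extra lookups.

```csharp
class A { public virtual void M() { } }
class B : A { }
class C : B { public override void M() { } }

class FlatDemo
{
    static void Main()
    {
        A a = new C();
        a.M();  // one slot lookup in C's flattened method table lands on C.M;
                // nothing walks through B or A at call time
    }
}
```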