Virtual functions and performance - C++

Posted 2019-01-02 19:18

In my class design, I use abstract classes and virtual functions extensively. I have a feeling that virtual functions affect performance. Is this true? I suspect the difference is not noticeable and that worrying about it is premature optimization. Right?

15 answers
高级女魔头
#2 · 2019-01-02 19:57

In my experience, the main thing that matters is whether the function can be inlined. If your performance requirements dictate that a function must be inlined, then you can't make it virtual, because a call that is dispatched at run time generally cannot be inlined. Otherwise, you probably won't notice the difference.
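To make the inlining point concrete, here is a minimal sketch. The Shape and Square types and both function names are made up for illustration, and whether a given compiler actually inlines either call depends on the compiler and its optimization settings.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration; none of these names come from the answer.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;   // dispatched at run time through the vtable
};

struct Square final : Shape {       // 'final': no class can override area() further
    int side;
    explicit Square(int s) : side(s) {}
    int area() const override { return side * side; }
};

// The dynamic type behind 'shape' is unknown here, so the compiler usually has to
// emit an indirect call and cannot inline area().
inline int total_area_dynamic(const std::vector<const Shape*>& shapes) {
    int total = 0;
    for (const Shape* shape : shapes) total += shape->area();
    return total;
}

// The static type is the final class Square, so the call can be bound at compile
// time and becomes a candidate for inlining.
inline int total_area_static(const std::vector<Square>& squares) {
    int total = 0;
    for (const Square& square : squares) total += square.area();
    return total;
}

// Small usage example: both versions compute the same total.
inline void area_usage_example() {
    std::vector<Square> squares{Square(2), Square(3)};
    std::vector<const Shape*> shapes{&squares[0], &squares[1]};
    assert(total_area_dynamic(shapes) == total_area_static(squares));
}
```

The results are identical; only the kind of call the compiler can generate differs.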

裙下三千臣
#3 · 2019-01-02 19:58

I've gone back and forth on this at least 20 times on my particular project. Virtual functions can bring great gains in code reuse, clarity, maintainability, and readability, but they do still carry a performance hit.

Is the performance hit going to be noticeable on a modern laptop, desktop, or tablet? Probably not! However, on some embedded systems the performance hit may be the main source of inefficiency in your code, especially if the virtual function is called over and over again in a loop.

Here's a somewhat dated paper that analyzes best practices for C/C++ in the embedded systems context: http://www.open-std.org/jtc1/sc22/wg21/docs/ESC_Boston_01_304_paper.pdf

To conclude: it's up to the programmer to understand the pros/cons of using a certain construct over another. Unless you're super performance driven, you probably don't care about the performance hit and should use all the neat OO stuff in C++ to help make your code as usable as possible.

明月照影归
#4 · 2019-01-02 19:59

The performance penalty of using virtual functions rarely outweighs the advantages you get at the design level. Supposedly a call to a virtual function is about 25% less efficient than a direct call to a statically bound (non-virtual) function, because there is a level of indirection through the VMT (virtual method table). However, the time taken to make the call is normally very small compared to the time spent actually executing the function, so the total performance cost will be negligible, especially on current hardware. Furthermore, the compiler can sometimes see that no virtual dispatch is needed and compile the call into a direct, statically bound call. So don't worry: use virtual functions and abstract classes as much as you need.
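The extra level of indirection can be sketched by writing the table out by hand. This is only an approximation: real vtable layouts are compiler-specific, and every name below (AnimalVTable, Animal, dog_legs and so on) is invented for the example.

```cpp
#include <cassert>

// Hand-written approximation of a vtable. Real layouts are compiler-specific;
// all names here are made up for this sketch.
struct Animal;

struct AnimalVTable {
    int (*legs)(const Animal*);     // one slot per virtual function
};

struct Animal {
    const AnimalVTable* vptr;       // stands in for the hidden pointer in each object
};

inline int dog_legs(const Animal*)  { return 4; }
inline int bird_legs(const Animal*) { return 2; }

const AnimalVTable dog_vtable{&dog_legs};
const AnimalVTable bird_vtable{&bird_legs};

// "Virtual" call: load vptr, load the slot, then make an indirect call.
inline int legs_virtual(const Animal* a) { return a->vptr->legs(a); }

// Direct call: the target is fixed at compile time, no table lookup needed.
inline int legs_direct_dog(const Animal* a) { return dog_legs(a); }

// Small usage example.
inline void vtable_usage_example() {
    Animal dog{&dog_vtable};
    Animal bird{&bird_vtable};
    assert(legs_virtual(&dog) == 4);
    assert(legs_virtual(&bird) == 2);
    assert(legs_direct_dog(&dog) == 4);
}
```

When the compiler can prove which table an object points to, it can replace the first form with the second, which is the optimization the answer mentions.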

爱死公子算了
#5 · 2019-01-02 20:01

One thing to note is that this:

boolean contains(A element) {
    for (A current: this)
        if (element.equals(current))
            return true;
    return false;
}

may be faster than this:

boolean contains(A element) {
    for (A current: this)
        if (current.equals(element))
            return true;
    return false;
}

This is because the first method always calls equals on the same object, so every iteration dispatches to one function, while the second calls equals on each element in turn and may therefore dispatch to many different functions. This applies to virtual functions in any language.

I say "may" because this depends on the compiler, the cache, etc.
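Since the question is about C++, here is a rough C++ analogue of the two loops above. The Item and TaggedItem classes and both function names are hypothetical, and the sketch only shows where the dispatch happens; it makes no claim about how large the speed difference is on any particular machine.

```cpp
#include <cassert>
#include <vector>

// Hypothetical hierarchy for illustration only.
struct Item {
    int id;
    explicit Item(int i) : id(i) {}
    virtual ~Item() = default;
    virtual bool equals(const Item& other) const { return id == other.id; }
};

// A second override, so a collection can hold objects with different equals() targets.
struct TaggedItem : Item {
    explicit TaggedItem(int i) : Item(i) {}
    bool equals(const Item& other) const override { return id == other.id; }
};

// The receiver is always 'element': every iteration dispatches to the same equals().
inline bool contains_fixed_receiver(const std::vector<const Item*>& items,
                                    const Item& element) {
    for (const Item* current : items)
        if (element.equals(*current)) return true;
    return false;
}

// The receiver changes each iteration: the call may land in a different override.
inline bool contains_varying_receiver(const std::vector<const Item*>& items,
                                      const Item& element) {
    for (const Item* current : items)
        if (current->equals(element)) return true;
    return false;
}

// Small usage example: both loops give the same answer.
inline void contains_usage_example() {
    Item a(1);
    TaggedItem b(2);
    std::vector<const Item*> items{&a, &b};
    assert(contains_fixed_receiver(items, Item(2)));
    assert(contains_varying_receiver(items, Item(2)));
    assert(!contains_fixed_receiver(items, Item(3)));
    assert(!contains_varying_receiver(items, Item(3)));
}
```

With a fixed receiver the indirect call has a single target for the whole loop, which is easier on the branch predictor and gives the compiler a chance to hoist the lookup.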

宁负流年不负卿
#6 · 2019-01-02 20:01

I have always wondered about this, especially since, quite a few years ago, I ran a test comparing the timing of a regular member function call with a virtual one and was really angry about the results at the time: empty virtual calls were 8 times slower than non-virtual ones.

Today I had to decide whether or not to use a virtual function for allocating more memory in my buffer class, in a very performance-critical app, so I googled (and found you) and, in the end, ran the test again.

// g++ -std=c++0x -o perf perf.cpp -lrt
#include <typeinfo>    // typeid
#include <cstdio>      // printf
#include <cstdlib>     // atoll
#include <ctime>       // clock_gettime

struct Virtual { virtual int call() { return 42; } }; 
struct Inline { inline int call() { return 42; } }; 
struct Normal { int call(); };
int Normal::call() { return 42; }

template<typename T>
void test(unsigned long long count) {
    std::printf("Timing function calls of '%s' %llu times ...\n", typeid(T).name(), count);

    timespec t0, t1;
    clock_gettime(CLOCK_REALTIME, &t0);

    T test;
    while (count--) test.call();

    clock_gettime(CLOCK_REALTIME, &t1);
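    // Subtract the timestamps, borrowing one second if the nanosecond field underflows.
    t1.tv_sec -= t0.tv_sec;
    t1.tv_nsec -= t0.tv_nsec;
    if (t1.tv_nsec < 0) {
        t1.tv_sec -= 1;
        t1.tv_nsec += 1000000000L;
    }

    // time_t is not guaranteed to be int, so cast before printing.
    std::printf(" -- result: %ld sec %ld nsec\n", static_cast<long>(t1.tv_sec), t1.tv_nsec);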
}

template<typename T, typename Ua, typename... Un>
void test(unsigned long long count) {
    test<T>(count);
    test<Ua, Un...>(count);
}

int main(int argc, const char* argv[]) {
    test<Inline, Normal, Virtual>(argc == 2 ? atoll(argv[1]) : 10000000000llu);
    return 0;
}

And I was really surprised that it, in fact, no longer matters at all. While it makes sense for inlined calls to be faster than non-virtual ones, and for those to be faster than virtual ones, the result often comes down to the overall load on the computer and whether your cache has the necessary data. And while you might be able to optimize at the cache level, I think that should be done by compiler developers rather than application developers.
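One caveat with micro-benchmarks like the one above is that an optimizing compiler may remove calls whose results are never used. Below is a minimal sketch of a variant that consumes each return value and uses a monotonic clock. The names VirtualCall, NormalCall, Timing and time_calls are my own, not from the original test, and because the compiler can still see the dynamic type of the local object it may devirtualize the call, so treat any numbers as a best case.

```cpp
#include <cassert>
#include <chrono>

struct VirtualCall { virtual ~VirtualCall() = default; virtual int call() { return 42; } };
struct NormalCall  { int call() { return 42; } };

struct Timing {
    long long sum;          // accumulated return values, so the calls are not dead code
    long long nanoseconds;  // elapsed time on a monotonic clock
};

template <typename T>
Timing time_calls(unsigned long long count) {
    T object;
    volatile long long sink = 0;   // volatile accesses keep the loop from being dropped
    const auto start = std::chrono::steady_clock::now();
    for (unsigned long long i = 0; i < count; ++i) sink = sink + object.call();
    const auto stop = std::chrono::steady_clock::now();
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    return Timing{sink, static_cast<long long>(ns)};
}

// Small usage example: ten calls returning 42 each add up to 420.
inline void timing_usage_example() {
    assert(time_calls<NormalCall>(10).sum == 420);
    assert(time_calls<VirtualCall>(10).sum == 420);
}
```

Comparing time_calls<VirtualCall>(n).nanoseconds with time_calls<NormalCall>(n).nanoseconds for a large n gives a rough figure, subject to the same machine-load and cache effects described above.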

栀子花@的思念
#7 · 2019-01-02 20:03

There's another performance criterion besides execution time. A vtable takes up memory space as well, and in some cases it can be avoided: ATL uses compile-time "simulated dynamic binding" with templates to get the effect of "static polymorphism", which is sort of hard to explain. You basically pass the derived class as a parameter to a base class template, so at compile time the base class "knows" what its derived class is in each instance. This won't let you store multiple different derived classes in a collection of base types (that's run-time polymorphism). But in the static case, if you want to make a class Y that is the same as a preexisting template class X which has the hooks for this kind of overriding, you just override the methods you care about, and you get the base methods of class X without needing a vtable.

In classes with large memory footprints, the cost of a single vtable pointer is not much, but some of the ATL classes in COM are very small, and it's worth the vtable savings if the run-time polymorphism case is never going to occur.
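The "pass the derived class as a template parameter" idea can be sketched as follows. This is a generic illustration of the pattern, not ATL's actual code, and all class names are invented. The size comparison holds on mainstream ABIs, where a polymorphic object stores a vtable pointer, but it is not guaranteed by the language.

```cpp
#include <cassert>

// Run-time polymorphism: each object carries a hidden vtable pointer.
struct VirtualBase {
    virtual ~VirtualBase() = default;
    virtual int value() const { return 1; }
    int doubled() const { return 2 * value(); }
};
struct VirtualDerived : VirtualBase {
    int value() const override { return 21; }
};

// Static polymorphism: the base template receives the derived class as a parameter
// and calls into it through static_cast, so no vtable is needed.
template <typename Derived>
struct StaticBase {
    int value() const { return 1; }                      // default "hook"
    int doubled() const {
        return 2 * static_cast<const Derived*>(this)->value();
    }
};
struct StaticDerived : StaticBase<StaticDerived> {
    int value() const { return 21; }                     // replaces the hook by name hiding
};
struct StaticDefault : StaticBase<StaticDefault> {};     // keeps the base behaviour

// Small usage example.
inline void crtp_usage_example() {
    assert(VirtualDerived().doubled() == 42);
    assert(StaticDerived().doubled() == 42);
    assert(StaticDefault().doubled() == 2);
    // Assumption: holds on common ABIs, where the vtable pointer adds to object size.
    assert(sizeof(StaticDerived) < sizeof(VirtualDerived));
}
```

The trade-off is the one described above: StaticDerived and StaticDefault share no common base type, so they cannot be stored together in one collection the way VirtualBase pointers can.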

See also this other SO question.

By the way here's a posting I found that talks about the CPU-time performance aspects.
