How does the compiler benefit from C++'s new f

Posted 2019-02-04 02:44

Question:

C++11 allows classes and virtual member functions to be marked final, which prohibits deriving from them or overriding them, respectively.

class Driver {
  virtual void print() const;
};
class KeyboardDriver : public Driver {
  void print() const final;    // overrides Driver::print; cannot be overridden again
};
class MouseDriver final : public Driver {
  void print() const;          // MouseDriver itself cannot be derived from
};
class Data final {
  int values_;
};

This is very useful, because it tells the reader of the interface something about how the class or method is intended to be used. The diagnostics a user gets when trying to override or derive anyway may be useful, too.

But is there an advantage from the compiler's point of view? Can the compiler do anything differently when it knows "this class will never be derived from" or "this virtual function will never be overridden"?

For final I mainly found only N2751 referring to it. Sifting through some of the discussions, I found arguments coming from the C++/CLI side, but no clear hint as to why final may be useful for the compiler. I ask because I also see a disadvantage of marking a class final: to unit-test protected member functions, one can derive a class and insert test code. Sometimes the classes tested this way are also good candidates for final, and in those cases the technique becomes impossible.
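The testing technique mentioned above can be sketched as follows. This is only an illustration: Parser, digit_value and ParserProbe are hypothetical names, not taken from any real code base.

```cpp
#include <cassert>

// Hypothetical class under test; all names are illustrative only.
class Parser {
public:
    virtual ~Parser() = default;
protected:
    // Protected helper that we would like to unit-test directly.
    int digit_value(char c) const {
        return (c >= '0' && c <= '9') ? c - '0' : -1;
    }
};

// Test-only subclass that re-exports the protected helper as public.
// This declaration stops compiling as soon as Parser is marked `final`.
class ParserProbe : public Parser {
public:
    using Parser::digit_value;
};

// Small self-check that exercises the helper through the probe.
inline bool probe_works() {
    ParserProbe p;
    assert(p.digit_value('7') == 7);
    return p.digit_value('x') == -1;
}
```

If Parser were declared `class Parser final`, ParserProbe would be rejected at compile time, which is exactly the trade-off described above.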

Answer 1:

I can think of one scenario where it might be helpful to the compiler from an optimisation perspective. I'm not sure if it's worth the effort to compiler implementers, but it's theoretically possible at least.

With virtual call dispatch on a derived, final type you can be sure that there is nothing else deriving from that type. This means that (at least in theory) the final keyword would make it possible to correctly resolve some virtual calls at compile time, which would make a number of optimisations possible that were otherwise impossible on virtual calls.

For example, if you have delete most_derived_ptr, where most_derived_ptr is a pointer to a derived, final type then it's possible for the compiler to simplify calls to the virtual destructor.

Likewise for calls to virtual member functions on references/pointers to the most derived type.
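A minimal sketch of the situation described above, using hypothetical Shape and Square types. Whether a given compiler actually turns the second call into a direct call depends on the compiler and the optimisation level; the code only shows where it is permitted to.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

// `final` guarantees that no class derives from Square, so a Square& or
// Square* always refers to an object whose dynamic type is exactly Square.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// The static type is only Shape&, so in general this call needs dynamic
// dispatch: the object might be of any class derived from Shape.
int sides_via_base(const Shape& s) { return s.sides(); }

// The static type is the final class, so the compiler may call
// Square::sides directly and possibly inline it.
int sides_via_final(const Square& s) { return s.sides(); }

// Both routes must of course produce the same result.
inline void check_shapes() {
    Square sq;
    assert(sides_via_base(sq) == 4);
    assert(sides_via_final(sq) == 4);
}
```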

I'd be very surprised if any compilers did this today, but it seems like the kind of thing that might be implemented over the next decade or so.

There might also be some mileage in being able to infer that (in the absence of friends) things marked protected in a final class effectively become private.



Answer 2:

Virtual calls to functions are slightly more costly than normal calls. In addition to actually performing the call, the runtime must first determine which function to call, which often leads to:

  1. Locating the v-table pointer, and through it reaching the v-table
  2. Locating the function pointer within the v-table, and through it performing the call
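The two steps above can be modelled by hand. The sketch below is not how any particular compiler lays out its v-tables; Object, VTable, and the function names are all hypothetical and only mirror the two loads that precede the indirect call.

```cpp
#include <cassert>

// Hand-rolled model of virtual dispatch. All names here are hypothetical.
struct Object;
using Method = int (*)(const Object*);

struct VTable {
    Method slots[1];        // slot 0 plays the role of one virtual function
};

struct Object {
    const VTable* vptr;     // stands in for the hidden v-table pointer
    int payload;
};

int base_value(const Object* o)    { return o->payload; }
int derived_value(const Object* o) { return o->payload * 2; }

const VTable base_vtable{{base_value}};
const VTable derived_vtable{{derived_value}};

// What a virtual call does under this model.
int call_value(const Object* o) {
    const VTable* table = o->vptr;   // step 1: load the v-table pointer
    Method fn = table->slots[0];     // step 2: load the function pointer
    return fn(o);                    // indirect call through that pointer
}

// A direct call, for comparison: the target is known at compile time.
int call_value_direct(const Object* o) { return base_value(o); }

inline void check_dispatch() {
    Object d{&derived_vtable, 21};
    assert(call_value(&d) == 42);
}
```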

Compared to a direct call where the address of the function is known in advance (and hard-coded with a symbol), this leads to a small overhead. Good compilers manage to make it only 10%-15% slower than a regular call, which is usually insignificant if the function has any meat.

A compiler's optimizer still seeks to avoid all kinds of overhead, and devirtualizing function calls is generally low-hanging fruit. For example, consider this C++03 code:

struct Base { virtual ~Base(); };

struct Derived: Base { virtual ~Derived(); };

void foo() {
  Derived d; (void)d;
}

Clang gets:

define void @foo()() {
  ; Allocate and initialize `d`
  %d = alloca i8**, align 8
  %tmpcast = bitcast i8*** %d to %struct.Derived*
  store i8** getelementptr inbounds ([4 x i8*]* @vtable for Derived, i64 0, i64 2), i8*** %d, align 8

  ; Call `d`'s destructor
  call void @Derived::~Derived()(%struct.Derived* %tmpcast)

  ret void
}

As you can see, the compiler was already smart enough to determine that, since d is known to be exactly a Derived, there is no need to incur the overhead of a virtual call.

In fact, it would optimize the following function just as nicely:

void bar() {
  Base* b = new Derived();
  delete b;
}

However, there are some situations where the compiler cannot reach this conclusion:

Derived* newDerived();

void deleteDerived(Derived* d) { delete d; }

Here we could (naively) expect that a call to deleteDerived(newDerived()); would result in the same code as before. However, that is not the case:

define void @foobar()() {
  %1 = tail call %struct.Derived* @newDerived()()
  %2 = icmp eq %struct.Derived* %1, null
  br i1 %2, label %_Z13deleteDerivedP7Derived.exit, label %3

; <label>:3                                       ; preds = %0
  %4 = bitcast %struct.Derived* %1 to void (%struct.Derived*)***
  %5 = load void (%struct.Derived*)*** %4, align 8
  %6 = getelementptr inbounds void (%struct.Derived*)** %5, i64 1
  %7 = load void (%struct.Derived*)** %6, align 8
  tail call void %7(%struct.Derived* %1)
  br label %_Z13deleteDerivedP7Derived.exit

_Z13deleteDerivedP7Derived.exit:                  ; preds = %3, %0
  ret void
}

Convention could dictate that newDerived returns a Derived, but the compiler cannot make such an assumption: what if it returned something further derived? Thus you get to see all the ugly machinery involved in retrieving the v-table pointer, selecting the appropriate entry in the table, and finally performing the call.

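If, however, we put a final in (the IR below is for a class Derived2, presumably declared like Derived but marked final), then we give the compiler a guarantee that the object cannot be anything else: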

define void @deleteDerived2(Derived2*)(%struct.Derived2* %d) {
  %1 = icmp eq %struct.Derived2* %d, null
  br i1 %1, label %4, label %2

; <label>:2                                       ; preds = %0
  %3 = bitcast i8* %1 to %struct.Derived2*
  tail call void @Derived2::~Derived2()(%struct.Derived2* %3)
  br label %4

; <label>:4                                      ; preds = %2, %0
  ret void
}

In short: final allows the compiler to avoid the overhead of virtual calls for the affected functions in situations where it could not otherwise prove the object's dynamic type.
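For reference, here is a sketch of C++ source that would correspond to the two delete functions discussed in this answer. The definition of Derived2 is an assumption (the answer only shows its IR), and the destroyed counter and the destructor bodies are added purely so the behaviour can be observed.

```cpp
#include <cassert>

int destroyed = 0;   // illustrative counter, not part of the original example

struct Base { virtual ~Base() {} };

struct Derived : Base {
    ~Derived() override { destroyed += 1; }
};

// Assumed shape of Derived2: like Derived, but marked final.
struct Derived2 final : Base {
    ~Derived2() override { destroyed += 10; }
};

// d may point to something derived from Derived, so the destructor call is
// virtual unless the optimiser can see where the pointer came from.
void deleteDerived(Derived* d) { delete d; }

// d is either null or points to exactly a Derived2, so the destructor
// call can be emitted as a direct call.
void deleteDerived2(Derived2* d) { delete d; }

inline void check_delete() {
    int before = destroyed;
    deleteDerived2(new Derived2);
    assert(destroyed == before + 10);
}
```

Both functions behave identically from the caller's point of view; final only changes what the compiler is allowed to assume when generating the call.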



Answer 3:

Depending on how you look at it, there's a further benefit to the compiler (although that benefit really accrues to the user, so arguably this isn't a compiler benefit): the compiler can avoid emitting warnings for constructs whose behavior would be questionable if the class or function could be overridden.

For example, consider this code:

class Base
{
  public:
    virtual void foo() { }
    Base() { }
    ~Base();
};

void destroy(Base* b)
{
  delete b;
}

Many compilers will emit a warning about Base's non-virtual destructor when they see delete b. If a class Derived inherited from Base and had its own ~Derived destructor, using destroy on a dynamically allocated Derived instance would usually call ~Base but not ~Derived (per the standard, the behavior is undefined). Thus ~Derived's cleanup operations wouldn't happen, and that could be bad (although probably not catastrophic, in most cases).

If the compiler knows that Base can't be inherited from, however, then it's no problem that ~Base is non-virtual, because no derived cleanup can be accidentally skipped. Adding final to class Base gives the compiler the information to not emit a warning.
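A sketch of the final variant, assuming the class is simply renamed and marked final. Resource and the cleanups counter are hypothetical names, and whether a warning is emitted or suppressed depends on the compiler and its warning flags.

```cpp
#include <cassert>

int cleanups = 0;   // illustrative counter to observe the destructor running

// Polymorphic class with a non-virtual destructor. Because it is final,
// no derived destructor exists that `delete` could accidentally skip.
class Resource final {
public:
    virtual void foo() {}
    Resource() {}
    ~Resource() { ++cleanups; }
};

// Deleting through Resource* is well-defined: the static type and the
// dynamic type are necessarily the same.
void destroy(Resource* r) { delete r; }

inline void check_destroy() {
    int before = cleanups;
    destroy(new Resource);
    assert(cleanups == before + 1);
}
```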

I know for a fact that using final in this way will suppress a warning with Clang. I don't know if other compilers emit a warning here, or if they take finality into account in determining whether or not to emit a warning.