Inlined functions have a non-inlined copy

Posted 2020-03-26 06:55

Question:

In Agner Fog's Optimizing C++ manual he has a section "Inlined functions have a non-inlined copy" where he writes

Function inlining has the complication that the same function may be called from another module. The compiler has to make a non-inlined copy of the inlined function for the sake of the possibility that the function is also called from another module. This non-inlined copy is dead code if no other modules call the function. This fragmentation of the code makes caching less efficient.

Let's make a test for this.

foo.h

inline double foo(double x) {
    return x;
}

t1.cpp

#include "foo.h"
double t1(double x) {
    return foo(x);
}

main.cpp

#include <stdio.h>
extern double foo(double);

int main(void) {
    printf("%f\n", foo(3.14159));
}

Compile with g++ t1.cpp main.cpp and it runs correctly. If I run g++ -S t1.cpp main.cpp and look at the assembly, I see that main.s calls a function defined in t1.s. Running g++ -c main.cpp and g++ -c t1.cpp and inspecting the symbols with nm shows U _Z3food in main.o and W _Z3food in t1.o. So Agner's claim that there is a non-inlined copy appears to be correct.

What about g++ -O1 t1.cpp main.cpp? This fails to link because foo is undefined. Running g++ -O1 -c t1.cpp and nm t1.o shows that _Z3food has been stripped out.

Now I am confused. I did not expect g++ to remove the non-inline copy with optimization enabled.

It seems that with optimization enabled, inline is equivalent to static inline, whereas without optimization inline means a non-inline copy is generated.

Maybe GCC does not think I would ever want the non-inline copy. But I can think of a case. Let's say I wanted to create a library and in the library I want a function defined in multiple translation units (so that the compiler could inline the code for the function in each translation unit) but I also want an external module linking to my library to be able to call the function defined in the library. I would obviously need a non-inlined version of the function for this.
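One way to get both behaviours without relying on the compiler keeping a copy of the inline function is to export an ordinary non-inline wrapper. The following is only a minimal sketch, shown as a single file for brevity; the names foo_impl and foo_impl.h are hypothetical and do not come from Agner's manual or from GCC.

```cpp
// Internal header (hypothetical name: foo_impl.h), included by every
// translation unit inside the library so the body is visible for inlining.
inline double foo_impl(double x) {
    return x;
}

// Defined in exactly one library source file: an ordinary, non-inline
// function with external linkage, so its symbol is always emitted and
// external modules can link against it.
double foo(double x) {
    return foo_impl(x);
}

// Other code inside the library calls the inline helper directly.
double t1(double x) {
    return foo_impl(x);
}
```

External users would then see only a declaration such as double foo(double); in the public header, while the library's own translation units include the internal header.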

One suggestion Agner gives if I don't want the non-inline copy is to use static inline. But from this question and its answers I infer that this is only useful to show intent. So on the one hand, without optimization it is clearly more than just intent, since plain inline produces a non-inline copy. But on the other hand, with optimization it really only seems to show intent, since the non-inline copy is stripped away anyway. This is confusing.

My questions:

  1. Is GCC correct in stripping away the non-inline copy with optimization enabled? In other words should there always be a non-inline copy if I don't use static inline?
  2. If I want to be certain that there is not a non-inline copy should I use static inline?

I just realized that I could have misinterpreted Agner's statement. When he says function inlining he may be referring to the compiler inlining the code and not to the use of the inline keyword. In other words, he could be referring to functions with external linkage, declared with neither inline nor static.

For example:

//foo.cpp
int foo(int x) {
    return x;
}

float bar(int x) {
    return 1.0*foo(x);
}

and

//main.cpp
#include <stdio.h>    
extern float bar(int x);    
int main(void) {
    printf("%f\n", bar(3));
}

Compiling with gcc -O3 foo.cpp main.cpp shows that foo was inlined into bar, but a non-inlined copy of foo, which is never called, is still in the binary.

Answer 1:
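For comparison, here is a minimal sketch of the same foo.cpp with foo given internal linkage through an unnamed namespace (static would do the same). This is an assumption about what the compiler typically does, not a guarantee: since no other translation unit can refer to foo, an optimizing compiler is free to drop the out-of-line copy once every call has been inlined.

```cpp
// foo.cpp, variant with internal linkage for foo
namespace {
// Not visible outside this translation unit, so no other module can call it.
int foo(int x) {
    return x;
}
}  // namespace

// bar keeps external linkage and remains callable from main.cpp.
float bar(int x) {
    return 1.0f * foo(x);
}
```

Running nm on the resulting object file is one way to check whether a symbol for foo is still present.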

The standard says that the full definition of an inline function needs to be visible in every translation unit that uses it:

An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case (3.2). [...] If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required.

(7.1.2/4 in N4140)

This does indeed make the first example in your question ill-formed (no diagnostic required): main.cpp declares foo without inline and never defines it.

This rule also applies to every TU of whatever external module links against your library. Those TUs would also need the full definition in C++ code, e.g. by defining the function in a header. So a compiler can safely omit any kind of "non-inlined copy" if the current translation unit does not need it.

As for being certain that the copy does not exist: the standard does not guarantee any optimization, so that is up to the compiler, both with and without an additional static keyword.
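A minimal sketch of a well-formed arrangement, shown as one file for brevity, could look like this. The function call_from_main is a hypothetical stand-in for the call made in main.cpp.

```cpp
// foo.h: included by every translation unit that calls foo
inline double foo(double x) {
    return x;
}

// t1.cpp would contain #include "foo.h" followed by:
double t1(double x) {
    return foo(x);
}

// main.cpp would also contain #include "foo.h" instead of the line
// "extern double foo(double);", so the inline definition is visible there too.
// call_from_main is a hypothetical stand-in for the call inside main().
double call_from_main(double x) {
    return foo(x);
}
```

With the definition visible in both translation units, neither depends on the other emitting an out-of-line copy of foo.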



Tags: c++ gcc inline