Agner Fog's Optimizing C++ manual has a section, "Inlined functions have a non-inlined copy", where he writes:
Function inlining has the complication that the same function may be called from another module. The compiler has to make a non-inlined copy of the inlined function for the sake of the possibility that the function is also called from another module. This non-inlined copy is dead code if no other modules call the function. This fragmentation of the code makes caching less efficient.
Let's make a test for this.
foo.h
inline double foo(double x) {
    return x;
}
t1.cpp
#include "foo.h"
double t1(double x) {
    return foo(x);
}
main.cpp
#include <stdio.h>
extern double foo(double);
int main(void) {
    printf("%f\n", foo(3.14159));
}
Compiling with g++ t1.cpp main.cpp works, and the program runs correctly. If I do g++ -S t1.cpp main.cpp and look at the assembly, I see that main.s calls a function defined in t1.s. Doing g++ -c main.cpp and g++ -c t1.cpp and looking at the symbols with nm shows U _Z3food in main.o and W _Z3food in t1.o. So it's clear that Agner's claim of there being a non-inlined copy is correct.
What about with g++ -O1 t1.cpp main.cpp? This fails to build: the link step reports foo as undefined. Doing g++ -O1 -c t1.cpp and nm t1.o shows that _Z3food has been stripped out.
Now I am confused. I did not expect g++ to remove the non-inline copy with optimization enabled.
It seems that with optimization enabled, inline is equivalent to static inline. But without optimization, inline means there is a non-inline copy generated.
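One way to probe whether inline really behaves like static inline under optimization is to force a use that cannot be inlined away, such as taking the function's address. The sketch below assumes a hypothetical helper get_foo that is not part of the original test. With it in t1.cpp, I would expect g++ -O1 -c t1.cpp followed by nm t1.o to list _Z3food as a weak (W) symbol again, whereas a static inline version would only ever appear as a local symbol.

```cpp
// t1.cpp variant: the same inline foo as in foo.h, repeated here so the
// snippet is self-contained.
inline double foo(double x) {
    return x;
}

// Hypothetical helper (an assumption for this experiment): returning the
// address of foo is a use that inlining cannot satisfy, so the compiler
// has to emit an out-of-line definition of foo in this object file.
double (*get_foo())(double) {
    return &foo;
}
```

If that expectation holds, it would suggest that the out-of-line copy is emitted only when this translation unit actually needs it, not that inline and static inline are the same thing.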
Maybe GCC does not think I would ever want the non-inline copy. But I can think of a case. Let's say I wanted to create a library and in the library I want a function defined in multiple translation units (so that the compiler could inline the code for the function in each translation unit) but I also want an external module linking to my library to be able to call the function defined in the library. I would obviously need a non-inlined version of the function for this.
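For the library case, one possible arrangement (a sketch, not something GCC or Agner prescribes) is to keep the inline definition in the header for callers that can see it, and to export a separate, ordinary function with external linkage for callers that cannot. The name foo_exported and the file name foo_export.cpp are hypothetical.

```cpp
// foo.h: inline definition, visible to every translation unit that
// includes the header, so the compiler can inline it there.
inline double foo(double x) {
    return x;
}

// foo_export.cpp (hypothetical file inside the library): an ordinary
// function with external linkage. It is not declared inline, so a symbol
// for it is emitted regardless of the optimization level, and code that
// only has a declaration can link against it.
double foo_exported(double x) {
    return foo(x);
}
```

An external module would then declare double foo_exported(double); and call that instead of foo.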
One suggestion Agner gives, if I don't want the non-inline copy, is to use static inline. But from this question and answers I infer that this is only useful to show intent. So on the one hand, without optimization it is clearly more than just intent, since plain inline produces a non-inline copy. But on the other hand, with optimization it really only seems to show intent, since the non-inline copy is stripped away anyway. This is confusing.
My questions:
- Is GCC correct in stripping away the non-inline copy with optimization enabled? In other words, should there always be a non-inline copy if I don't use static inline?
- If I want to be certain that there is not a non-inline copy, should I use static inline?
I just realized that I could have misinterpreted Agner's statement. When he says function inlining, he may be referring to the compiler inlining the code and not to the use of the inline keyword. In other words, he could be referring to functions defined with extern and not with inline or static.
For example:
//foo.cpp
int foo(int x) {
    return x;
}
float bar(int x) {
    return 1.0*foo(x);
}
and
//main.cpp
#include <stdio.h>
extern float bar(int x);
int main(void) {
    printf("%f\n", bar(3));
}
Compiling with gcc -O3 foo.cpp main.cpp shows that foo was inlined into bar, but that a non-inlined copy of foo, which is never used, is still in the binary.