Consider the following function:
extern void test1(void);
extern void test2(void) {
    test1();
}
This is the code gcc generates without -fpic on amd64 Linux:
test2:
    jmp test1
When I compile with -fpic, gcc explicitly calls through the PLT to enable symbol interposition:
test2:
    jmp test1@PLT
This, however, is not strictly needed for position-independent code and could be left out if I don't want to support symbol interposition. If necessary, the linker rewrites the jump target to the PLT symbol anyway.
How can I, without changing the source code and without making the compiled code unsuitable for a shared library, make function calls go directly to their targets instead of going explicitly through the PLT?
If you can't change the source code, you could use a big hammer: the -Bsymbolic linker flag.
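As a rough illustration of the -Bsymbolic approach, here is a minimal sketch. The file name bsym_demo.c, the library name, the counter test1_calls and the helper check_test2 are all made up for this example, and the build commands in the comment assume a GNU toolchain on Linux. The source is compiled exactly as before; only the link step changes.

```c
/* bsym_demo.c: hypothetical file name, used only for this sketch.
 *
 * Assumed build steps (GNU toolchain on Linux):
 *   gcc -O2 -fpic -c bsym_demo.c
 *   gcc -shared -Wl,-Bsymbolic -o libbsym_demo.so bsym_demo.o
 *
 * The object file still refers to test1 through the PLT, but with
 * -Bsymbolic the linker binds that reference to the definition inside
 * the library, so the call can no longer be interposed.
 */
#include <assert.h>

int test1_calls; /* hypothetical counter so the call is observable */

void test1(void) { test1_calls++; }

void test2(void) { test1(); }

/* Small self-check: one call to test2 reaches test1 exactly once. */
void check_test2(void) {
    int before = test1_calls;
    test2();
    assert(test1_calls == before + 1);
}
```

Disassembling the resulting library (for example with objdump -d) should show whether test2 jumps straight to test1 on your toolchain.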
But beware that it will break if some parts of the library rely on symbol interposition. I'd recommend hiding functions that don't need to be exported (by annotating them with __attribute__((visibility("hidden")))) or calling them through hidden aliases (specifically designed to do PLT-less intra-library calls in a controlled fashion).

If you declare test1() hidden (__attribute__((__visibility__("hidden")))), the jump will be direct.

Now test1() may not be defined as hidden in the translation unit that defines it, but I believe no harm should come from that discrepancy, except that the C language guarantee that &test1 == &test1 might be broken for you at runtime if one of the pointers was obtained via a hidden reference and one via a public one. The public reference might have been interposed via preloading or by a DSO that came before the current one in the lookup scope, while the hidden reference (which results in direct jumps) effectively prevents any kind of interposition.

A more proper way to deal with this would be to define two names for test1(): a public name and a private/hidden name. In gcc and clang, this can be done with some alias magic, which can only be done in the translation unit that defines the symbol.
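To make the two-name idea concrete, here is a minimal sketch, not the listing from the original answer. The names test1_hidden, call_public, call_hidden, check_calls and the counter test1_calls are hypothetical, and the alias attribute assumes an ELF target with gcc or clang.

```c
#include <assert.h>

int test1_calls; /* hypothetical counter so the calls are observable */

/* Public, interposable definition. */
void test1(void) { test1_calls++; }

/* Second, hidden name for the same code. The alias has to live in the
 * translation unit that defines test1. */
extern __typeof__(test1) test1_hidden
    __attribute__((__alias__("test1"), __visibility__("hidden")));

/* Uses the public name: with gcc -fpic this is expected to go via the PLT. */
void call_public(void) { test1(); }

/* Uses the hidden alias: expected to bind directly inside the library. */
void call_hidden(void) { test1_hidden(); }

/* Small self-check: both names reach the same function. */
void check_calls(void) {
    int before = test1_calls;
    call_public();
    call_hidden();
    assert(test1_calls == before + 2);
}
```

Compiling with gcc -O2 -fpic -S and comparing the bodies of call_public and call_hidden is a quick way to confirm which call sites still mention the PLT.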
Macros can make it prettier:
The above compiles (-O3, x86-64) into:
(Defining HERE=1 additionally inlines the test1 call since it's small and local and -O3 is on).
Live example at https://godbolt.org/g/eZvmp7.