Is there a way to force an inline function in Clang/LLVM?
AFAIK, the following is just a hint to the compiler but it can ignore the request.
__attribute__((always_inline))
I don’t mind that the compilation will fail if it can’t inline the function.
There is a good solution if you compile with C99 inline semantics, which is what Clang uses by default.
It's simply using the inline keyword.
inline void foo() {}
It is explained well on Clang's compatibility page:
By default, Clang builds C code according to the C99 standard, which provides different semantics for the inline keyword than GCC's default behavior...
In C99, inline means that a function's definition is provided only for inlining, and that there is another definition (without inline) somewhere else in the program. That means that this program is incomplete, because if add isn't inlined (for example, when compiling without optimization), then main will have an unresolved reference to that other definition. Therefore we'll get a (correct) link-time error...
GCC recognizes it as an extension and just treats it as a hint to the optimizer.
So in order to guarantee that the function is inlined:
- Don’t use static inline.
- Don’t add another implementation of the function that doesn't have the inline keyword.
- You must enable optimization. But even without optimization the build will fail (at link time), which is good.
- Make sure not to compile with GNU89.
I am going to treat your question as asking for any tools within the Clang/LLVM framework. Here is my suggestion: compile your code to LLVM bitcode and then run the Always inline pass.
For example:
> clang <other CFLAGS> -emit-llvm -c -o foo.bc foo.c
> opt -always-inline foo.bc -o foo_inline.bc
> clang -c -o foo.o foo_inline.bc
I have used this sequence before and it has inlined all of my functions marked "always_inline". In my case, I was already doing other analyses and transforms on the bitcode, so I only had to add the flag to opt.
You can start with experimenting with:
clang -mllvm -inline-threshold=n
The greater the parameter n, the more aggressive the inlining will be.
Default is 225 so set it to something bigger. Expect big code size and
long compilation times with very aggressive inlining. When you hit the
point of diminishing returns you can try profiling the code and looking
for frequently called but uninlined functions and try marking them with
__attribute__((always_inline)) for even more inlining.
If you have functions marked "inline", you can also experiment with
-inlinehint-threshold bigger than -inline-threshold and see whether this
changes anything.
Also, are you compiling with link-time optimizations? Without them
inlining is limited to individual compilation units.
(Taken from a groups.google.com forum post.)
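For illustration, marking a hot helper the way the quoted post suggests might look roughly like the sketch below. The function names clamp_to_byte and brighten are made up for this example, and static is used only so the snippet compiles on its own without a separate external definition.

```c
/* Hypothetical hot helper: the attribute asks GCC/Clang to inline it
 * at every call site, regardless of the inline threshold. */
static inline __attribute__((always_inline)) int clamp_to_byte(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Hypothetical caller: the body of clamp_to_byte() is expected to be
 * expanded here instead of being called. */
int brighten(int pixel, int delta)
{
    return clamp_to_byte(pixel + delta);
}
```

Whether the call really disappeared can be checked by looking at the generated assembly (for example with clang -S).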
Just a few remarks that might be useful too.
For the OP's comments:
- Multiple static inline definitions are a caveat: if one of them is changed, you end up with several different functions under the same name, which can cause lots of head-scratching, especially if inlining kicks in and the actual calls evaporate into different sequences of statements.
- This can have similar effects to the first point.
- Inlining is an optimization, and you can look into your compiler's manual to see when it kicks in (e.g. gcc doc page). Usually it is enabled at the first optimization level. See also this answer.
A useful discussion and recommendation can be found here.
The advice for C99 is summed up as follows:
In a header file define the following and include it wherever it's required:
inline void foo() { /*...*/ }
In a single source file, declare it using extern to generate the external symbol:
extern inline void foo();
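As a rough sketch of that header-plus-extern pattern, collapsed into one file so it can be compiled on its own: the names add and add_three and their int signatures are assumptions for illustration. In a real project the first definition would live in the header and the extern line in exactly one .c file.

```c
/* Normally in a header: a C99 inline definition.
 * On its own it does not emit an external symbol. */
inline int add(int a, int b)
{
    return a + b;
}

/* Normally in exactly one .c file: redeclaring with extern makes this
 * translation unit provide the external definition, so calls that are
 * not inlined still link. */
extern inline int add(int a, int b);

/* Hypothetical caller: works whether or not add() gets inlined. */
int add_three(int a, int b, int c)
{
    return add(add(a, b), c);
}
```

With the extern line removed, an unoptimized C99 build would be expected to fail at link time, which is the behavior the Clang compatibility page describes.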
As for the proposed LLVM IR method, it works, but then you are past the source-language domain and subject to a different set of rules (highly dependent on the tool).
A brief indicative discussion can be found here.
The brute-force method is to just turn the function into a macro.
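A minimal sketch of what that could look like, with SQUARE and sum_of_squares as made-up names for illustration:

```c
/* Hypothetical macro standing in for a small function. Every use is
 * expanded textually by the preprocessor, so no call is left over.
 * Parenthesize each argument and the whole body. */
#define SQUARE(x) ((x) * (x))

/* Hypothetical caller: after preprocessing this is plain arithmetic. */
int sum_of_squares(int a, int b)
{
    return SQUARE(a) + SQUARE(b);
}
```

The usual macro caveats apply: there is no type checking, and an argument with side effects such as SQUARE(i++) is evaluated more than once.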