Does __attribute__((always_inline))
force a function to be inlined by gcc?
相关问题
- Error building gcc 4.8.3 from source: libstdc++.so
- What are the recommended GNU linker options to spe
- What is the right order of linker flags in gcc?
- C# Java-like inline extension of classes?
- Why doesn't g++ -Wconversion warn about conver
相关文章
- gcc/g++ gives me error “CreateProcess: No such fil
- Calls that precede a function's definition can
- How can I use gcc's -I command to add recursiv
- How do I know if std::map insert succeeded or fail
- How to specify gcc flags (CXXFLAGS) particularly f
- C++: How to use unnamed template parameters in cla
- Kotlin inlined extension property
- How to generate assembly code with gcc that can be
Yes.
From documentation
Yes, it will. That doesn't mean it's a good idea.
According to the gcc optimize options documentation, you can tune inlining with parameters:
I suggest reading more in details about all the parameters for inlining, and setting them appropriately.
Yes. It will inline the function regardless of any other options set. See here.
It should. I'm a big fan of manual inlining. Sure, used in excess it's a bad thing. But often times when optimizing code, there will be one or two functions that simply have to be inlined or performance goes down the toilet. And frankly, in my experience C compilers typically do not inline those functions when using the inline keyword.
I'm perfectly willing to let the compiler inline most of my code for me. It's only those half dozen or so absolutely vital cases that I really care about. People say "compilers do a good job at this." I'd like to see proof of that, please. So far, I've never seen a C compiler inline a vital piece of code I told it to without using some sort of forced inline syntax (
__forceinline
on msvc__attribute__((always_inline))
on gcc).I want to add here that I have a SIMD math library where inlining is absolutely critical for performance. Initially I set all functions to inline but the disassembly showed that even for the most trivial operators it would decide to actually call the function. Both MSVC and Clang showed this, with all optimization flags on.
I did as suggested in other posts in SO and added
__forceinline
for MSVC and__attribute__((always_inline))
for all other compilers. There was a consistent 25-35% improvement in performance in various tight loops with operations ranging from basic multiplies to sines.I didn't figure out why they had such a hard time inlining (perhaps templated code is harder?) but the bottom line is: there are very valid use cases for inlining manually and huge speedups to be gained.
If you're curious this is where I implemented it. https://github.com/redorav/hlslpp