-O2 optimizes printf("%s\n", str) to puts(str)

Posted 2019-06-23 02:39

Question:

Toying around with clang, I compiled a C program containing this line:

printf("%s\n", argv[0]);

When compiling without optimization, the assembly output called printf after setting up the registers:

movq    (%rcx), %rsi
movq    %rax, %rdi
movb    $0, %al
callq   _printf

I then tried compiling with clang -O2. The printf call was replaced with a puts call:

movq    (%rsi), %rdi
callq   _puts

While this makes perfect sense in this case, it raises two questions:

  1. How often does function call substitution happen in optimized compilation? Is this frequent or is stdio an exception?
  2. Could I write compiler optimizations for my own libraries? How would I do that?

Answer 1:

  1. How often does function call substitution happen in optimized compilation? Is this frequent or is stdio an exception?

The optimization that replaces printf with puts in LLVM lives in the class LibCallSimplifier. The header file is llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h and the implementation is in llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp. Reading these files shows some of the other library call replacements that are done (the header file is probably the easier place to start). And of course LLVM performs many other optimizations; you can get an idea of some of them by looking at the list of LLVM passes.
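As a rough illustration, here is a small C function with a few printf call shapes that LibCallSimplifier is known to target. The function name print_variants is made up for this sketch, and whether each substitution actually happens depends on the compiler version and flags, so the comments mark candidates only; compiling with -O2 -S and reading the assembly is the way to confirm.

```c
#include <stdio.h>

/* Prints s in several ways. None of the printf results are used, which
   leaves the compiler free to substitute simpler calls.
   Returns the number of output lines, purely so callers can check it. */
int print_variants(const char *s) {
    printf("%s\n", s);    /* candidate for puts(s) */
    printf("done\n");     /* candidate for puts("done") */
    printf("%c", s[0]);   /* candidate for putchar(s[0]) */
    printf("\n");         /* candidate for putchar('\n') */
    return 3;
}
```

Calling print_variants("hello") prints three lines: hello, done, and h.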

  2. Could I write compiler optimizations for my own libraries? How would I do that?

Yes, you could. LLVM is very modular and performs transformations on the IR in a series of passes. So if you wanted to add a custom pass for your own library you could do so (although it still takes a fair amount of work to understand how the LLVM compiler flow works). A good starting point is the document: Writing an LLVM Pass.



Answer 2:

This sort of optimization depends on the compiler knowing that a function named printf can only be the printf function defined by the C standard. If a program defines printf to mean something else, the program invokes undefined behaviour. This lets the compiler substitute a call to puts in cases where it would work "as if" the standard printf function had been called. It doesn't have to worry about it working "as if" a user-defined printf function was called. So these kinds of function substitution optimizations are pretty much limited to functions defined in the C or C++ standards. (Maybe other standards as well, if the compiler somehow knows that a given standard is in force.)
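One consequence of the "as if" rule is worth a small sketch: the two functions differ in what they return. The standard printf returns the number of characters written, while puts only promises a non-negative value on success, so the substitution is only unobservable when the result of printf is discarded. The function names below are invented for illustration.

```c
#include <stdio.h>

/* The result of printf is used, so the call cannot simply become puts(s):
   the caller would see a different return value. */
int print_and_count(const char *s) {
    return printf("%s\n", s);   /* characters written, negative on error */
}

/* The result is discarded, so replacing the call with puts(s) changes
   nothing the program can observe. */
void print_only(const char *s) {
    printf("%s\n", s);
}
```

With a working stdout, print_and_count("abc") returns 4: three characters plus the newline.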

Short of modifying the source code of the compiler yourself, there's no way to tell the compiler that these kinds of function substitutions are possible with your own functions. However, with limitations, you can do something similar with inline functions. For example, you could implement something similar to the printf/puts optimization like this:

inline int myprintf(char const *fmt, char const *arg) {
    /* Take the cheap path when the format is exactly "%s\n". */
    if (strcmp(fmt, "%s\n") == 0) {
        return myputs(arg);
    }
    return _myprintf_impl(fmt, arg);
}

With optimization turned on, the compiler can choose at compile time which function to call based on the fmt parameter, but only if it can determine that fmt is a constant string. If it can't, or if optimization isn't enabled, the compiler has to emit code that checks the format on each call, and that could easily turn this into a pessimization. Note that this optimization depends on the compiler knowing how strcmp works and removing the call entirely, so it is itself an example of another library function call substitution the compiler can make.
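For anyone who wants to try this, here is a self-contained sketch of the same dispatch. Everything apart from the dispatch itself is an assumption made for the example: myputs and the general-purpose function are trivial stand-ins built on stdio, the general-purpose function is spelled myprintf_impl (without the leading underscore, since such names are reserved at file scope), and the two counters exist only to make the chosen path visible. It uses static inline, because a plain inline function in C99 with no external definition can fail to link when the call is not inlined.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative counters recording which path each call took. */
static int fast_calls = 0;
static int slow_calls = 0;

/* Hypothetical cheap path: print the string followed by a newline. */
static int myputs(const char *s) {
    fast_calls++;
    return puts(s);
}

/* Hypothetical general path: full formatting of one string argument. */
static int myprintf_impl(const char *fmt, const char *arg) {
    slow_calls++;
    return printf(fmt, arg);
}

/* Dispatch on the format string. With optimization on and a literal fmt,
   the compiler may fold the strcmp and keep only one branch. */
static inline int myprintf(const char *fmt, const char *arg) {
    if (strcmp(fmt, "%s\n") == 0) {
        return myputs(arg);
    }
    return myprintf_impl(fmt, arg);
}
```

A call such as myprintf("%s\n", "hello") goes through myputs, while myprintf("[%s]\n", "hello") goes through myprintf_impl.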

You can improve on this with GCC's __builtin_constant_p function:

inline int myprintf(char const *fmt, char const *arg) {
        if (__builtin_constant_p(fmt[0])
            && strcmp(fmt, "%s\n") == 0) {
                return myputs(arg);
        }
        return _myprintf_impl(fmt, arg);
}

Under GCC this results in code that never checks the format string at run time. If GCC can determine at compile time that fmt is "%s\n", it generates code that calls myputs unconditionally; otherwise it generates code that calls _myprintf_impl unconditionally. So with optimization enabled this function is never a pessimization. Unfortunately, while clang supports the __builtin_constant_p function, my version of clang always generates code that calls _myprintf_impl unconditionally.
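A portable variant of the same idea might guard the builtin behind a macro, so that compilers without it simply take the general path every time. This is a sketch under assumptions: MY_IS_CONSTANT, the counters, and the stdio-based stand-ins for myputs and the general function (spelled myprintf_impl here) are all invented for the example. Which branch a given compiler keeps is not guaranteed, but the output is the same on either branch.

```c
#include <stdio.h>
#include <string.h>

/* Use the GNU builtin where available; otherwise treat nothing as constant. */
#if defined(__GNUC__) || defined(__clang__)
#define MY_IS_CONSTANT(x) __builtin_constant_p(x)
#else
#define MY_IS_CONSTANT(x) 0
#endif

/* Illustrative counters recording which path each call took. */
static int fast_calls = 0;
static int slow_calls = 0;

/* Hypothetical cheap path. */
static int myputs(const char *s) {
    fast_calls++;
    return puts(s);
}

/* Hypothetical general path. */
static int myprintf_impl(const char *fmt, const char *arg) {
    slow_calls++;
    return printf(fmt, arg);
}

/* The strcmp is only reached when the compiler reports fmt[0] as a
   compile-time constant, so no run-time check is paid otherwise. */
static inline int myprintf(const char *fmt, const char *arg) {
    if (MY_IS_CONSTANT(fmt[0]) && strcmp(fmt, "%s\n") == 0) {
        return myputs(arg);
    }
    return myprintf_impl(fmt, arg);
}
```

Either way the caller sees the same text on stdout; only the function that produced it differs.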



Answer 3:

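puts is a much smaller function than printf, and executables that use only puts are typically about half the size. printf is only needed for formatted output, such as converting numbers to strings for printing; for integers you can do this with itoa() instead, although itoa() is not part of standard C and is not available everywhere.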
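Because itoa() is not standard C, here is a minimal sketch of the same idea using a hand-written conversion. The names int_to_str and print_int_line are hypothetical, and the buffer sizes assume int is at most 32 bits wide.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes the decimal form of value into buf and
   returns buf. buf must hold at least 12 bytes (sign, 10 digits, NUL),
   which assumes int is no wider than 32 bits. */
static char *int_to_str(int value, char *buf) {
    char tmp[12];
    size_t n = 0;
    size_t i = 0;
    /* Work on the unsigned magnitude so the most negative int is safe. */
    unsigned int mag = value < 0 ? 0u - (unsigned int)value
                                 : (unsigned int)value;
    /* Collect digits least significant first. */
    do {
        tmp[n++] = (char)('0' + mag % 10u);
        mag /= 10u;
    } while (mag != 0u);
    if (value < 0) {
        buf[i++] = '-';
    }
    /* Copy the digits back in the right order. */
    while (n > 0) {
        buf[i++] = tmp[--n];
    }
    buf[i] = '\0';
    return buf;
}

/* Prints an integer and a newline without going through printf. */
static void print_int_line(int value) {
    char buf[12];
    puts(int_to_str(value, buf));
}
```

For example, print_int_line(-123) prints -123 followed by a newline.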