Why is C++ initial allocation so much larger than

2020-05-12 05:25发布

问题:

When using the same code, simply changing the compiler (from a C compiler to a C++ compiler) will change how much memory is allocated. I'm not quite sure why this is and would like to understand it more. So far the best response I've gotten is "probably the I/O streams", which isn't very descriptive and makes me wonder about the "you don't pay for what you don't use" aspect of C++.

I'm using the Clang and GCC compilers, versions 7.0.1-8 and 8.3.0-6 respectively. My system is running on Debian 10 (Buster), latest. The benchmarks are done via Valgrind Massif.

#include <stdio.h>

int main() {
    printf("Hello, world!\n");
    return 0;
}

The code used does not change, but whether I compile as C or as C++, it changes the results of the Valgrind benchmark. The values remain consistent across compilers, however. The runtime allocations (peak) for the program go as follows:

  • GCC (C): 1,032 bytes (1 KB)
  • G++ (C++): 73,744 bytes, (~74 KB)
  • Clang (C): 1,032 bytes (1 KB)
  • Clang++ (C++): 73,744 bytes (~74 KB)

For compiling, I use the following commands:

clang -O3 -o c-clang ./main.c
gcc -O3 -o c-gcc ./main.c
clang++ -O3 -o cpp-clang ./main.cpp
g++ -O3 -o cpp-gcc ./main.cpp

For Valgrind, I run valgrind --tool=massif --massif-out-file=m_compiler_lang ./compiler-lang on each compiler and language, then ms_print for displaying the peaks.

Am I doing something wrong here?

回答1:

The heap usage comes from the C++ standard library. It allocates memory for internal library use on startup. If you don't link against it, there should be zero difference between the C and C++ version. With GCC and Clang, you can compile the file with:

g++ -Wl,--as-needed main.cpp

This will instruct the linker to not link against unused libraries. In your example code, the C++ library is not used, so it should not link against the C++ standard library.

You can also test this with the C file. If you compile with:

gcc main.c -lstdc++

The heap usage will reappear, even though you've built a C program.

The heap use is obviously dependant to the specific C++ library implementation you're using. In your case, that's the GNU C++ library, libstdc++. Other implementations might not allocate the same amount of memory, or they might not allocate any memory at all (at least not on startup.) The LLVM C++ library (libc++) for example does not do heap allocation on startup, at least on my Linux machine:

clang++ -stdlib=libc++ main.cpp

The heap use is the same as not linking at all against it.

(If compilation fails, then libc++ is probably not installed. The package name usually contains "libc++" or "libcxx".)



回答2:

Neither GCC nor Clang are compilers -- they're actually toolchain driver programs. That means they invoke the compiler, the assembler, and the linker.

If you compile your code with a C or a C++ compiler you will get the same assembly produced. The Assembler will produce the same objects. The difference is that the toolchain driver will provide different input to the linker for the two different languages: different startups (C++ requires code for executing constructors and destructors for objects with static or thread-local storage duration at namespace level, and requires infrastructure for stack frames to support unwinding during exception processing, for example), the C++ standard library (which also has objects of static storage duration at namespace level), and probably additional runtime libraries (for example, libgcc with its stack-unwinding infrastructure).

In short, it's not the compiler causing the increase in footprint, it's the linking in of stuff you've chose to use by choosing the C++ language.

It's true that C++ has the "pay only for what you use" philosophy, but by using the language, you pay for it. You can disable parts of the language (RTTI, exception handling) but then you're not using C++ any more. As mentioned in another answer, if you don't use the standard library at all you can instruct the driver to leave that out (--Wl,--as-needed) but if you're not going to use any of the features of C++ or its library, why are you even choosing C++ as a programming language?