I have a C++ library that generates much larger code than I would expect for what it does. From less than 50K lines of source I get shared objects that are almost 4 MB and static archives pushing 9 MB. This is a problem both because the library binaries are quite large and, much worse, because even simple applications linking against it typically gain 500 to 1000 KB in code size. Compiling the library with flags like -Os helps somewhat, but not very much.
I have also experimented with GCC's -frepo option (even though all the documentation I've seen suggests that on Linux collect2 will merge duplicate templates anyway) and with explicit template instantiation of templates that seemed "likely" to be duplicated a lot, but with no real effect in either case. I say "likely" because, as with any kind of profiling, blind guessing like this is almost always wrong.
Is there some tool that makes it easy to profile code size, or some other way I can figure out what is taking up so much room, or, more generally, any other things I should try? Something that works under Linux would be ideal but I'll take what I can get.
One method that is very crude but very quick is to look at the size of your object files. Not all the code in the object files will be compiled into the final binary, so there may be a few false positives, but it can give a good impression of where the hotspots will be. Once you've found the largest object files you can then delve into them with tools like `objdump` and `nm`.
On Linux the linker certainly does merge multiple template instantiations.
Make sure you aren't measuring debug binaries (debug info could take up more than 75% of the final binary size).
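As a rough sketch of the measuring steps above, the following shell helpers rank object files by size, list the largest symbols in one file, and print section totals so that debug info is not mistaken for code. The function names are made up for illustration, and the sketch assumes GNU findutils (for `find -printf`) and binutils (for `nm` and `size`).

```shell
# Hypothetical helper: list the N largest .o files under a directory.
# Usage: largest_objects DIR [N]
largest_objects() {
    # GNU find prints "<bytes> <path>"; sort numerically, biggest first.
    find "${1:-.}" -name '*.o' -type f -printf '%s %p\n' \
        | sort -rn | head -n "${2:-10}"
}

# Hypothetical helper: list the N largest symbols in an object or library.
# Usage: largest_symbols FILE [N]
largest_symbols() {
    # -S prints symbol sizes, -C demangles C++ names,
    # --size-sort orders by size (largest last, hence tail).
    nm -S -C --size-sort "$1" | tail -n "${2:-20}"
}

# Hypothetical helper: text/data/bss totals, which leave out debug sections.
# Usage: code_size FILE
code_size() {
    size "$1"
}
```

For example, `largest_objects build 5` followed by `largest_symbols` on the top entry gives a first impression of where the bytes go; `build` is just a placeholder directory name.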
One technique to reduce final binary size is to compile with `-ffunction-sections` and `-fdata-sections`, then link with `-Wl,--gc-sections`.
An even bigger reduction (we've seen 25%) may be possible if you use a development version of [gold][1] (the new ELF-only linker, part of binutils) and link with `-Wl,--icf`.
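To see the section-based approach in action, here is a minimal sketch that builds a toy program twice, once normally and once with `-ffunction-sections -fdata-sections` plus `-Wl,--gc-sections`. The function name, file names and the `unused_helper` symbol are invented for the demo, and it assumes `g++` with a GNU-compatible linker on Linux.

```shell
# Hypothetical demo: build a tiny program with and without section GC.
# Prints the directory that holds the two executables.
gc_sections_demo() {
    work=$(mktemp -d) || return 1
    cat > "$work/demo.cpp" <<'EOF'
// unused_helper is never called, so its section is a candidate for removal.
extern "C" int unused_helper(int x) { return x * 42; }
int main() { return 0; }
EOF
    # Baseline: unused_helper sits in the same .text section as main.
    g++ -Os "$work/demo.cpp" -o "$work/plain" || return 1
    # One section per function and data item; the linker then drops
    # every section that nothing references.
    g++ -Os -ffunction-sections -fdata-sections "$work/demo.cpp" \
        -Wl,--gc-sections -o "$work/trimmed" || return 1
    echo "$work"
}
```

Running `nm` or `size` on the `plain` and `trimmed` files in the printed directory shows whether the unreferenced function was discarded. On a real library the gain depends on how much unreferenced code each object file carries.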
Another useful technique is reducing the set of symbols which are "exported" by your shared libraries (everything is exported by default), either via `__attribute__((visibility(...)))` or by using a linker script. Details here (see "Export control").
If you want to find out what is being put into your executable, then ask your tools. Turn on the ld linker's `--print-map` (or `-M`) option to produce a map file showing what it has put in memory and where. Doing this for the statically linked example is probably more informative.
If you're not invoking ld directly, but only via the gcc command line, you can pass ld-specific options to ld from the gcc command line by preceding them with `-Wl,`.
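Putting the two points together, a link step that saves the map might look like the sketch below. The helper name and the `.map` suffix are assumptions for illustration; the only tool options used are `-Wl,` and `--print-map`, which makes the linker write the map to standard output.

```shell
# Hypothetical helper: link a program and save the linker map next to it.
# Usage: link_with_map OUTPUT INPUTS_AND_FLAGS...
link_with_map() {
    out=$1
    shift
    # -Wl, forwards --print-map to ld; the map arrives on standard output,
    # so redirect it into a file beside the executable.
    g++ "$@" -o "$out" -Wl,--print-map > "$out.map"
}
```

Calling it with your own objects and libraries (and `-static`, if you want the statically linked case) leaves a map file listing each input section, the object file it came from, and its size.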