Analyzing an ELF binary to minimize its size

Posted 2019-05-05 09:39

Question:

I'm cross-compiling a V8 project to an embedded ARM target using the GCC arm-gnueabi cross compiler. I got the V8 library itself cross-compiled successfully, and as a smoke test I wanted to link it to Google's hello world example and run it on the ARM board.

The libraries themselves clock in at a bit over 1.2 MB:

v8 % find out/arm.release/obj.target/ -name '*.a' -exec du -h {} + 
1.2M    out/arm.release/obj.target/tools/gyp/libv8_base.a
12K     out/arm.release/obj.target/tools/gyp/libv8_libbase.a
4.0K    out/arm.release/obj.target/tools/gyp/libv8_libplatform.a
4.0K    out/arm.release/obj.target/tools/gyp/libv8_snapshot.a
4.0K    out/arm.release/obj.target/tools/gyp/libv8_nosnapshot.a
4.0K    out/arm.release/obj.target/third_party/icu/libicudata.a
164K    out/arm.release/obj.target/third_party/icu/libicuuc.a
336K    out/arm.release/obj.target/third_party/icu/libicui18n.

Yet when I build and link with

arm-linux-gnueabi-g++ -pthread -Iv8/include hi.cpp -Os -o hi_v8 -Wl,--start-group v8/out/arm.release/obj.target/{tools/gyp/libv8_{base,libbase,snapshot},third_party/icu/libicu{uc,i18n,data}}.a -Wl,--end-group

I get an executable that's 20 MB. Stripping it only gets me down to 17 MB.

What is being linked in that balloons the file size so much? How can I avoid it? What tools can I use to diagnose the problem? This size may be problematic on the platform I am targeting.

I've taken a look at readelf --sections, but it just tells me how large the .text section is overall, which isn't particularly helpful. I also took a look at the suggestions here and tried using nm, but it's too specific: I just get a bunch of name-mangled symbols like _ZN2v88internal11FLAG_log_gcE.

Answer 1:

First, if you haven't already, use size -A hi_v8 to determine what section or sections are bigger than you expect. It's not always the text section.
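To see the largest sections at a glance, the size -A output can be sorted by size. This is a minimal sketch, assuming the cross binutils are installed under the same arm-linux-gnueabi- prefix as the compiler in the question; rank_sections and largest_sections are hypothetical helper names, not part of any tool.

```shell
# Tool name is an assumption: cross binutils usually share the compiler prefix.
SIZE="${SIZE:-arm-linux-gnueabi-size}"

# Read `size -A` output on stdin; print "bytes section", largest first.
# Section rows have exactly three fields: name, size, address.
rank_sections() {
    awk 'NF == 3 && $2 ~ /^[0-9]+$/ { print $2, $1 }' | sort -rn
}

# Usage: largest_sections hi_v8
largest_sections() {
    "$SIZE" -A "$1" | rank_sections | head -n 10
}
```

Running largest_sections hi_v8 before and after stripping shows which sections account for the difference.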

Next, add -Wl,-Map,hi_v8.map to the g++ command line. This generates a linker map in the file hi_v8.map. The file is very verbose, but it shows the contribution of each object file to each section of the executable.
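Applied to the link command from the question, that could look like the sketch below. The wrapper name link_with_map is hypothetical, and the archives are passed as arguments rather than spelled out.

```shell
# Cross compiler driver from the question; override CXX if your prefix differs.
CXX="${CXX:-arm-linux-gnueabi-g++}"

# Link hi.cpp against the given archives and write a linker map next to the output.
# Usage: link_with_map hi_v8 libA.a libB.a ...
link_with_map() {
    out="$1"; shift
    "$CXX" -pthread -Iv8/include hi.cpp -Os -o "$out" \
        -Wl,-Map,"$out.map" \
        -Wl,--start-group "$@" -Wl,--end-group
}
```

The map is written even though the rest of the link is unchanged, so the resulting executable should be identical to the one built without the flag.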

The linker map has a number of sections. The first, "Archive member included because of file (symbol)", is helpful for figuring out what caused an object to be linked into the executable, but not so much for finding what is inflating its size. Once you have figured that out, and it turns out to be an errant library, you can come back to this section. The "Allocating common symbols", "Discarded input sections", and "Memory Configuration" sections are probably not going to be very helpful.

The "Linker script and memory map" section is where you want to focus your attention. It's essentially a trace of the linker script used to produce the executable. First check the LOAD statements at the start; they show every file that the executable was linked with. Check whether there are any files you didn't expect to see. Note, however, that libraries are mentioned here even if none of their object files were linked into the executable.
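The LOAD statements can be pulled out of the map on their own for a quick review. A small sketch, with list_loaded_files as a hypothetical helper name:

```shell
# Print each file named in a LOAD statement of a GNU ld map, without duplicates.
# Usage: list_loaded_files hi_v8.map
list_loaded_files() {
    sed -n 's/^LOAD //p' "$1" | sort -u
}
```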

Now you'll have to wade through the trace of every linked object file and its symbols being added to every section of the executable. Since it's just a "Hello World" program, it shouldn't be too bad. Skip to the section that you've identified as being the problem. Then scan through the list of objects and see if you can find either a place where a large number of not obviously necessary object files are being linked in, or a place where the address suddenly jumps by a large amount. The latter should be relatively easy to spot, but the former can be hard to identify. It might help to generate a linker map for a statically linked "Hello World" program on your native platform to see the sort of library routines it links in.
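If scanning by eye gets tedious, the per-file contributions can be totalled with awk. This is only a rough sketch: it assumes the usual GNU ld map layout, where each input-section line ends in an address, a size, and a file name, and it will miscount paths that contain spaces. The name map_totals is hypothetical.

```shell
# Read a GNU ld map on stdin; print "bytes file" per object or archive, largest first.
map_totals() {
    awk '
    # Convert a 0x-prefixed hex string to a number (portable, no gawk strtonum).
    function hex(s,    i, n) {
        n = 0
        s = tolower(substr(s, 3))
        for (i = 1; i <= length(s); i++)
            n = n * 16 + index("0123456789abcdef", substr(s, i, 1)) - 1
        return n
    }
    # Ignore everything before the memory map, e.g. discarded sections.
    /^Linker script and memory map/ { in_map = 1; next }
    # Input-section lines end in: address, size, file.
    in_map && NF >= 3 && $1 != "*fill*" &&
    $(NF-2) ~ /^0x[0-9a-fA-F]+$/ && $(NF-1) ~ /^0x[0-9a-fA-F]+$/ {
        file = $NF
        sub(/\(.*\)$/, "", file)    # libfoo.a(bar.o) -> libfoo.a
        total[file] += hex($(NF-1))
    }
    END { for (f in total) print total[f], f }
    ' | sort -rn
}
```

For example, map_totals < hi_v8.map | head -n 20 lists the twenty largest contributors. The totals cover every output section, including ones that are not loaded at run time, such as debug information.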

My guess is that your problem will show up as either a large jump in the address or some file that obviously shouldn't be linked in, so don't be put off by the verbosity of the map file. The C++ symbols should also be demangled, so it won't be as bad as nm. (Though you can get nm to demangle the names as well with the -C option.)
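For the nm route, combining -C with size sorting gives a readable list of the biggest symbols. A sketch, assuming the cross nm is installed as arm-linux-gnueabi-nm and that it is run on the unstripped executable; biggest_symbols is a hypothetical helper name.

```shell
# Tool name is an assumption: cross binutils usually share the compiler prefix.
NM="${NM:-arm-linux-gnueabi-nm}"

# Print the N largest symbols (default 20) with demangled names, largest last.
# -S prints each symbol's size, --size-sort orders the output by that size.
# Usage: biggest_symbols hi_v8 [N]
biggest_symbols() {
    "$NM" -C -S --size-sort "$1" | tail -n "${2:-20}"
}
```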



Answer 2:

I do not know ARM, so pmap may not be available to you. Sorry if this cannot fly for you. GNU/Linux distributions usually have the pmap command.

Rewrite and compile hello world so that it stays running (memory resident) until you hit return; an fgets call will work.

Next, in a separate window, execute

pmap -d pid

Where pid is the pid of the hello world process. pmap -d shows the size of each memory mapping in the process, which includes the executable and every shared object it loaded.
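The pmap output can be summed per mapping name to make the large entries stand out. A sketch, assuming the procps pmap -d column layout (Address, Kbytes, Mode, Offset, Device, Mapping); mapping_totals is a hypothetical helper name.

```shell
# Read `pmap -d` output on stdin; print "KiB mapping", largest first.
mapping_totals() {
    awk 'NF >= 6 && $2 ~ /^[0-9]+$/ {
        # The mapping name starts at field 6 and may contain spaces ("[ anon ]").
        name = $6
        for (i = 7; i <= NF; i++) name = name " " $i
        kb[name] += $2
    }
    END { for (n in kb) print kb[n], n }' | sort -rn
}
```

Usage would be pmap -d pid | mapping_totals, with pid as above.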



Answer 3:

It may also be worth enabling linker garbage collection (--gc-sections), assuming the toolchain supports it, it isn't already enabled by default, and the linker script is written correctly. See https://sourceware.org/binutils/docs/ld/Options.html#index-g_t_002d_002dgc_002dsections-173.
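A sketch of what that could look like for the link command from the question. The wrapper name link_with_gc is hypothetical. Garbage collection works on whole sections, so it is most effective when the code was compiled with -ffunction-sections and -fdata-sections; here that only applies to hi.cpp, and the V8 and ICU archives would need to be rebuilt with the same flags to benefit fully.

```shell
# Cross compiler driver from the question; override CXX if your prefix differs.
CXX="${CXX:-arm-linux-gnueabi-g++}"

# Link with section garbage collection and report which sections were dropped.
# Usage: link_with_gc hi_v8 libA.a libB.a ...
link_with_gc() {
    out="$1"; shift
    "$CXX" -pthread -Iv8/include hi.cpp -Os \
        -ffunction-sections -fdata-sections \
        -o "$out" \
        -Wl,--gc-sections -Wl,--print-gc-sections \
        -Wl,--start-group "$@" -Wl,--end-group
}
```

--print-gc-sections makes the linker list every section it removed, which shows whether the option is having any effect.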



Tags: gcc linker arm v8 elf