How to decrease the size of generated binaries?

Posted 2020-05-19 02:09

I know that there is an option "-Os" to "Optimize for size", but it has little effect, and on some occasions it even increases the size :(

strip (or the "-s" option) removes the symbol table, which works fine, but it only trims a small proportion of the size.

Is there any other way to go further?

7 answers
再贱就再见
#2 · 2020-05-19 02:32

It also depends on the architecture you are using.

On ARM, you have the Thumb instruction set, which exists to reduce the generated code size (with GCC, select it with -mthumb).

You can also avoid dynamic linking and prefer static linking for libs only used by your program or very few programs on your system. This will not decrease the size of your generated binary per se, but overall, you will use less space on your system for this program.

爷、活的狠高调
#3 · 2020-05-19 02:32

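You can try playing with -fdata-sections, -ffunction-sections and -Wl,--gc-sections, but this is not always safe: the linker discards every section it cannot reach from the entry point, including code or data that is only reached indirectly (for example through a custom linker section). Be sure to understand how they work before using them.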
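
A minimal sketch of what these flags act on, assuming GCC or Clang with GNU ld. The file name and function names are hypothetical, and the build commands in the comments are only one way to wire it up:

```cpp
// sections_demo.cc (hypothetical file name)
// Build sketch, assuming you provide your own main() in main.cc:
//   g++ -Os -ffunction-sections -fdata-sections -c sections_demo.cc main.cc
//   g++ -Wl,--gc-sections sections_demo.o main.o -o demo

// Called from the rest of the program, so its section stays reachable.
int used_helper(int x) {
    return x * 2 + 1;
}

// Never called: with -ffunction-sections it is placed in its own section,
// which --gc-sections is then allowed to discard at link time.
int unused_helper(int x) {
    return x * x;
}

// Data is treated the same way under -fdata-sections.
int unused_table[4] = {1, 2, 3, 5};
```

Comparing the output of size(1) with and without -Wl,--gc-sections shows whether anything was actually dropped for your program.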

贼婆χ
#4 · 2020-05-19 02:34

Assuming that another tool is also allowed ;-)

Then consider UPX: the Ultimate Packer for eXecutables, which uses runtime decompression.

Happy coding.

劫难
#5 · 2020-05-19 02:40

When using strip(1), you'll want to make sure you use all the relevant options. --strip-all removes symbols, but it doesn't always strip everything: some sections can remain. Removing unnecessary sections explicitly (with -R / --remove-section) may be helpful.

Ultimately, though, the best way to reduce the size of the binary is to remove code and static data from the program. Make it do less, or select programming constructs that result in fewer instructions. For example, you might build data structures at runtime, or load them from a file, on-demand, rather than have a statically initialized array.
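
As a sketch of the "build it at runtime" idea, assuming a CRC-32 lookup table as the example data (the thread does not name one). The names crc32_small and build_crc_table are made up for illustration. A statically initialized table of 256 32-bit entries occupies 1024 bytes in the file, while a zero-initialized one lives in .bss and costs only the small loop that fills it:

```cpp
#include <cstdint>

// Zero-initialized, so it goes to .bss and takes no space in the binary file.
static std::uint32_t crc_table[256];
static bool crc_table_ready = false;

// Fills the table on first use (standard reflected CRC-32 polynomial).
static void build_crc_table() {
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t c = i;
        for (int bit = 0; bit < 8; ++bit)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[i] = c;
    }
    crc_table_ready = true;
}

// Hypothetical helper: CRC-32 of a byte buffer using the lazily built table.
// Note: the lazy initialization here is not thread-safe.
std::uint32_t crc32_small(const unsigned char* data, unsigned long len) {
    if (!crc_table_ready) build_crc_table();
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned long i = 0; i < len; ++i)
        crc = crc_table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

The trade-off is a little startup work and a few dozen bytes of code in exchange for the table's size on disk. For small tables the static version may still be smaller.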

The star"
#6 · 2020-05-19 02:42

You can also use -nostartfiles and/or -nodefaultlibs, or -nostdlib, which combines both. If you drop the standard start files, you must write your own _start function. See also this thread (archived) on oompf:

(quoting Perrin)

# man syscalls
# cat phat.cc
extern "C" void _start() {
        asm("int $0x80" :: "a"(1), "b"(42));
}
# g++ -fno-exceptions -Os -c phat.cc
# objdump -d phat.o

phat.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <_start>:
   0:   53                      push   %rbx
   1:   b8 01 00 00 00          mov    $0x1,%eax
   6:   bb 2a 00 00 00          mov    $0x2a,%ebx
   b:   cd 80                   int    $0x80
   d:   5b                      pop    %rbx
   e:   c3                      retq
# ld -nostdlib -nostartfiles phat.o -o phat
# sstrip phat
# ls -l phat
-rwxr-xr-x 1 tbp src 294 2007-04-11 22:47 phat
# ./phat; echo $?
42

Summary: Above snippet yielded a binary of 294 bytes, each byte 8 bits.

地球回转人心会变
#7 · 2020-05-19 02:47

Apart from the obvious (-Os -s), aligning functions to the smallest possible value that will not crash (I don't know ARM alignment requirements) might squeeze out a few bytes per function. In GCC this is controlled with -falign-functions=N.
-Os should already disable aligning functions, but this might still default to a value like 4 or 8. If aligning e.g. to 1 is possible with ARM, that might save some bytes.

-ffast-math (or the less abrasive -fno-math-errno) stops math functions from setting errno and avoids some checks, which reduces code size. If, like most people, you don't read errno anyway, that's an option.
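
A small sketch of where this matters, assuming GCC on a target with a hardware square-root instruction (the function name is hypothetical). With default flags the compiler typically has to keep a fallback call to the library sqrtf so that errno can be set for invalid input. With -fno-math-errno it can emit just the instruction:

```cpp
#include <cmath>

// Compare the generated assembly of:
//   g++ -Os -S length.cc
//   g++ -Os -fno-math-errno -S length.cc
// The second build is expected to drop the errno-preserving library call.
float vector_length(float x, float y) {
    return std::sqrt(x * x + y * y);
}
```

Whether the saving is visible depends on the target and compiler version, so checking the assembly is the reliable way to tell.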

Properly using __restrict (or restrict) and const removes redundant loads, making code both faster and smaller (and more correct). Properly marking pure functions as such lets the compiler eliminate redundant function calls.
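
A minimal sketch of both hints, assuming GCC or Clang (in C++, __restrict and __attribute__((pure)) are compiler extensions). The function names are made up for illustration:

```cpp
// Without __restrict the compiler must assume dst may alias *scale and
// reload *scale on every iteration; with it, the load can be hoisted.
void scale_all(float* __restrict dst, const float* __restrict scale, int n) {
    for (int i = 0; i < n; ++i)
        dst[i] *= *scale;
}

// "pure": the result depends only on the arguments and on memory it reads,
// and the function has no side effects.
__attribute__((pure)) int table_sum(const int* values, int n);

int twice_sum(const int* values, int n) {
    // Because table_sum is pure, the compiler may call it once and
    // double the result instead of emitting two calls.
    return table_sum(values, n) + table_sum(values, n);
}

int table_sum(const int* values, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += values[i];
    return total;
}
```

Both annotations are promises to the compiler. If they are wrong (the pointers do alias, or the function does have side effects), the result is undefined behaviour rather than a diagnostic.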

Enabling LTO may help, and if that is not available, compiling all source files into a binary in one go (gcc foo.c bar.c baz.c -o program instead of compiling foo.c, bar.c, and baz.c to object files first and then linking) will have a similar effect. It makes everything visible to the optimizer at one time, possibly allowing it to work better.

-fdelete-null-pointer-checks may be an option (note that this is normally enabled with any "O", but not on embedded targets).

Putting static globals (you hopefully don't have that many, but still) into a struct can eliminate a lot of the overhead of initializing them. I learned that when writing my first OpenGL loader. Having all the function pointers in a struct and initializing the struct with = {} generates one call to memset, whereas initializing the pointers the "normal way" generates a hundred kilobytes of code just to set each one to zero individually.
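
A sketch of the struct approach, using a hypothetical four-entry table in place of a real OpenGL loader (the struct and member names are illustrative only):

```cpp
using proc_t = void (*)();

// All loader state lives in one aggregate instead of separate globals.
struct GlProcs {
    proc_t clear;
    proc_t draw_arrays;
    proc_t bind_buffer;
    proc_t use_program;
    // a real loader would list hundreds more entries here
};

// One zero-initialized object: no per-pointer initialization code.
static GlProcs g_procs = {};

// Resetting the whole table is a single aggregate assignment, which the
// compiler can lower to one memset or a few wide stores.
void reset_procs() {
    g_procs = {};
}

// Counts how many entries have been resolved (are non-null).
int loaded_count() {
    return (g_procs.clear != nullptr) + (g_procs.draw_arrays != nullptr)
         + (g_procs.bind_buffer != nullptr) + (g_procs.use_program != nullptr);
}
```

How much this saves depends on the compiler. Recent optimizers may merge individual stores on their own, so measuring both variants is worthwhile.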

Avoid static local variables with non-trivial constructors like the devil (POD types are no problem). GCC initializes such static locals in a thread-safe way unless you compile with -fno-threadsafe-statics, and that thread-safe initialization links in a lot of extra code (even if you don't use threads at all).
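
A sketch contrasting the two cases, assuming a compiler that follows the Itanium C++ ABI (GCC, Clang). The Config type and the DEMO_VERBOSITY variable name are hypothetical:

```cpp
#include <cstdlib>

// Non-trivial constructor: the value is only known at runtime.
struct Config {
    int verbosity;
    Config() {
        const char* v = std::getenv("DEMO_VERBOSITY");  // hypothetical name
        verbosity = v ? std::atoi(v) : 0;
    }
};

// The static local needs a guard variable plus calls to
// __cxa_guard_acquire / __cxa_guard_release, unless the code is built
// with -fno-threadsafe-statics.
const Config& config() {
    static Config c;
    return c;
}

// Constant-initialized POD static local: no guard, no runtime support code.
int next_id() {
    static int id = 0;
    return ++id;
}
```

Only use -fno-threadsafe-statics if no two threads can reach the first call to config() at the same time, since it removes exactly that protection.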

Using something like libowfat instead of the normal crt can greatly reduce your binary size.
