Why do common C compilers include the source filen

Posted 2019-04-06 06:55

Question:

I have learnt from this recent answer that gcc and clang include the source filename somewhere in the binary as metadata, even when debugging is not enabled.

I can't really understand why this would be a good idea. Besides the small privacy risk, this also happens when one optimizes for the size of the resulting binary (-Os), which seems inefficient.

Why do the compilers include this information?

Answer 1:

GCC includes the filename mainly for debugging purposes: it lets a programmer identify which source file a given symbol comes from, as (tersely) outlined in the ELF spec (p. 1-17) and further expanded upon in some Oracle docs on linking.

An example of using the STT_FILE symbol is given by this SO question.
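
As a rough illustration of how a tool could use these symbols, here is a minimal Python sketch that reads a symbol table and groups local symbols under the STT_FILE entry that precedes them. It relies on the ELF rule that a file symbol, when present, comes before the other local symbols of that file. The function name symbols_by_file is made up for this example, and the sketch assumes a little-endian ELF64 file and skips most error checking.

```python
import struct

SHT_SYMTAB = 2   # section type of the static symbol table
STT_FILE = 4     # symbol type whose name is a source file name
STB_LOCAL = 0    # binding of file-local symbols

# Little-endian ELF64 layouts: file header, section header, symbol entry.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
SHDR = struct.Struct("<IIQQQQIIQQ")
SYM = struct.Struct("<IBBHQQ")


def _cstr(table: bytes, offset: int) -> str:
    """Read a NUL-terminated name from a string table."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("utf-8", "replace")


def symbols_by_file(elf: bytes) -> dict:
    """Map each STT_FILE name to the named local symbols that follow it.

    Hypothetical helper for illustration; returns {} if there is no .symtab.
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    hdr = EHDR.unpack_from(elf, 0)
    shoff, shentsize, shnum = hdr[6], hdr[11], hdr[12]
    sections = [SHDR.unpack_from(elf, shoff + i * shentsize)
                for i in range(shnum)]

    result = {}
    for sh in sections:
        sh_type, sh_offset, sh_size, sh_link = sh[1], sh[4], sh[5], sh[6]
        if sh_type != SHT_SYMTAB:
            continue
        # sh_link of a symbol table is the index of its string table.
        str_sh = sections[sh_link]
        strtab = elf[str_sh[4]:str_sh[4] + str_sh[5]]
        current = None
        for off in range(sh_offset, sh_offset + sh_size, SYM.size):
            st_name, st_info, _other, _shndx, _value, _size = \
                SYM.unpack_from(elf, off)
            name = _cstr(strtab, st_name)
            sym_type, bind = st_info & 0xF, st_info >> 4
            if sym_type == STT_FILE:
                current = name
                result.setdefault(current, [])
            elif bind == STB_LOCAL and current is not None and name:
                # Unnamed entries (e.g. section symbols) are skipped.
                result[current].append(name)
    return result
```

Calling symbols_by_file(open("bignum.o", "rb").read()) on an unstripped object should return a dictionary keyed by the source filename. Global symbols are left out on purpose, since their position in the table does not tie them to a file symbol.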

I'm still not sure why both GCC and Clang include it even when you specify -g0, but you can stop them from emitting the STT_FILE symbol with -s. I couldn't find any explanation for this, nor could I find an "official reason" why STT_FILE is part of the ELF specification (which is very terse).



Answer 2:

"I have learnt from this recent answer that gcc includes the source filename somewhere in the binary as metadata, even when debugging is not enabled."

Not quite. In modern ELF object files the file name is indeed present, but as a symbol of type FILE in the symbol table:

$ readelf -a bignum.o    # Source bignum.c
[...]
Symbol table (.symtab) contains 36 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS bignum.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
     9: 00000000000003f0   172 FUNC    GLOBAL DEFAULT    1 add
    10: 00000000000004a0   104 FUNC    GLOBAL DEFAULT    1 copy

However, once stripped, the symbol is gone:

$ strip bignum.o
$ readelf -a bignum.o | grep bignum.c
$

So to keep your privacy, strip the executable, or compile/link with -s.