Where are GDB symbols coming from?

Posted 2020-03-20 13:37

Question:

When I load Fedora 28's /usr/bin/ls file into GDB, I can access the symbol abformat_init, even though it appears neither as a string nor in the symbol table of the binary file.

$ file /usr/bin/ls
/usr/bin/ls: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=d6d0ea6be508665f5586e90a30819d090710842f, stripped, too many notes (256)
$ readelf -S /usr/bin/ls | grep abformat
$ nm /usr/bin/ls
nm: /usr/bin/ls: no symbols
$ strings /usr/bin/ls | grep abformat
$ gdb /usr/bin/ls
[...]
Reading symbols from /usr/bin/ls...Reading symbols from /usr/bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install coreutils-8.29-7.fc28.x86_64
(gdb) info symbol abformat_init 
abformat_init in section .text of /usr/bin/ls

Where does this symbol come from? Is there a program that can extract such symbols outside of GDB?

Answer 1:

TL;DR:

  1. There is a special .gnu_debugdata compressed section in Fedora binaries that GDB reads, and which contains mini-symbols.
  2. Contents of that section can be conveniently printed with eu-readelf -Ws --elf-section /usr/bin/ls

readelf -S /usr/bin/ls | grep abformat

That command is dumping sections. You want symbols instead:

readelf -s /usr/bin/ls | grep abformat
readelf --all /usr/bin/ls | grep abformat

strings /usr/bin/ls | grep abformat

Strings tries to guess what you want, and doesn't output all strings found in the binary. See this blog post and try:

strings -a /usr/bin/ls | grep abformat

Update: I confirmed the results you've observed: abformat does not appear anywhere, yet GDB knows about it.

Turns out, there is a .gnu_debugdata compressed section (described here), which has mini-symbols.

To extract this data, normally you would do:

objcopy -O binary -j .gnu_debugdata /usr/bin/ls ls.mini.xz

However, that is broken on my system (produces empty output), so instead I used dd:

# You may need to adjust the numbers below from "readelf -WS /usr/bin/ls"
dd if=/usr/bin/ls of=ls.mini.xz bs=1 skip=151896 count=3764
xz -d ls.mini.xz
nm ls.mini | grep abformat

This produced:

0000000000005db0 t abformat_init

QED.
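
The skip and count values passed to dd above are the Off and Size columns of the .gnu_debugdata row in the readelf -WS output, converted from hex to decimal. A minimal sketch that does the lookup and conversion automatically follows. The helper name debugdata_range is hypothetical, and the column order (Name, Type, Address, Off, Size) is assumed to match binutils readelf.

```shell
#!/bin/sh
# Sketch: derive the dd arguments for .gnu_debugdata from "readelf -WS" output.
# debugdata_range is a hypothetical helper name, not part of binutils.

# Reads "readelf -WS" output on stdin and prints "<offset> <size>" in decimal.
debugdata_range() {
    awk '
        # Convert a hex string to decimal without the gawk-only strtonum().
        function hex2dec(h,    k, n) {
            n = 0
            h = tolower(h)
            for (k = 1; k <= length(h); k++)
                n = n * 16 + index("0123456789abcdef", substr(h, k, 1)) - 1
            return n
        }
        {
            # The "[Nr]" prefix may contain a space (e.g. "[ 5]"), so find
            # the name field first and count columns relative to it:
            # Name, Type, Address, Off, Size.
            for (i = 1; i <= NF; i++)
                if ($i == ".gnu_debugdata") {
                    print hex2dec($(i + 3)), hex2dec($(i + 4))
                    exit
                }
        }'
}

# Usage (assumes readelf, dd and xz are installed):
#   set -- $(readelf -WS /usr/bin/ls | debugdata_range)
#   dd if=/usr/bin/ls of=ls.mini.xz bs=1 skip="$1" count="$2"
#   xz -d ls.mini.xz
```

For the section used above (offset 0x25158, size 0xeb4), the helper prints 151896 3764, matching the numbers given to dd. If the binary has no .gnu_debugdata section, it prints nothing.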

Additional info:

  1. The confusing GDB message "no debugging symbols found" is addressed in this bug.
  2. objcopy refusing to copy .gnu_debugdata is the subject of this bug.
  3. There is a tool that can conveniently dump this info:

    eu-readelf -Ws --elf-section /usr/bin/ls | grep abformat
    37: 0000000000005db0 593 FUNC LOCAL DEFAULT 14 abformat_init



Answer 2:

Is there a program that can extract such symbols outside of GDB?

Yes, you can use nm to extract the symbol, but you should look for the symbol in a separate debug info file, because the binary itself is stripped.

You can use readelf or objdump to find the name of the separate debug info file; see "How to know the name and/or path of the debug symbol file which is linked to a binary executable?":

$ objdump -s -j .gnu_debuglink /usr/bin/ls

/usr/bin/ls:     file format elf64-x86-64

Contents of section .gnu_debuglink:
 0000 6c732d38 2e33302d 362e6663 32392e78  ls-8.30-6.fc29.x
 0010 38365f36 342e6465 62756700 5cddcc98  86_64.debug.\...

On Fedora 29 the separate debug info file name for /usr/bin/ls is ls-8.30-6.fc29.x86_64.debug.

Normally, on Fedora, separate debug info is installed under the /usr/lib/debug/ directory, so the full path to the debug info file is /usr/lib/debug/usr/bin/ls-8.30-6.fc29.x86_64.debug.

Now you can look for the symbol with nm:

$ nm /usr/lib/debug/usr/bin/ls-8.30-6.fc29.x86_64.debug | grep abformat_init
0000000000006d70 t abformat_init

Note that the separate debug info must first be installed with debuginfo-install; this is what the GDB message is telling you.
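
The path construction described above (debug directory, then the directory of the binary, then the name stored in .gnu_debuglink) can be sketched as a small shell helper. The name debuglink_path is hypothetical, and /usr/lib/debug is assumed to be the debug directory, as on Fedora; other locations that GDB may also search are not covered.

```shell
#!/bin/sh
# Sketch: build the Fedora-style path of a separate debug info file.
# debuglink_path is a hypothetical helper; /usr/lib/debug is an assumption
# that holds for Fedora and may differ on other distributions.

# $1 = path of the stripped binary
# $2 = file name read from the .gnu_debuglink section
debuglink_path() {
    printf '/usr/lib/debug%s/%s\n' "$(dirname "$1")" "$2"
}

# Usage:
#   nm "$(debuglink_path /usr/bin/ls ls-8.30-6.fc29.x86_64.debug)" | grep abformat_init
```

With the file name shown in the objdump output above, the helper yields /usr/lib/debug/usr/bin/ls-8.30-6.fc29.x86_64.debug, the same path passed to nm.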