Debugging the C runtime

Posted 2019-05-12 00:29

I want to get a detailed look at what's going on both before and after main() using GDB. Would it be enough just to recompile glibc with -g and link against that?

2 answers
闹够了就滚
#2 · 2019-05-12 01:02

If you want to explore this in the debugger, you can set up GDB this way:

  • install the debug info for the `glibc` package (on Fedora, `debuginfo-install glibc` does this; I don't know about the other distros)
  • or point GDB to a consistent debug file directory:
(gdb) show debug-file-directory
The directory where separate debug symbols are searched for is "/usr/lib/debug".
(gdb) set debug-file-directory ...

(on my system the file is /usr/lib/debug/lib64/libc-2.14.so.debug)

  • tell GDB to continue the backtrace past your `main`:
(gdb) show backtrace past-entry
Whether backtraces should continue past the entry point of a program is off.
(gdb) set backtrace past-entry on
  • then you should see the frames you're looking for, and you can navigate through them:
(gdb) where
#0  main () at test.c:4
#1  __libc_start_main (main=0x40050f <main>, argc=1,...) at libc-start.c:226
#2  _start ()
看我几分像从前
#3 · 2019-05-12 01:27

You don't need to start in the debugger.

When the OS loads your executable, it passes control to its entry point, which is not the function named main(). With GCC and glibc, the true entry point is usually named _start, but your mileage can vary depending on your platform. Of course, if you aren't using glibc, or are using a different C compiler, then it can vary even more.
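To see that the entry point really is recorded separately from main(), one can read it out of the executable's ELF header. The following is a minimal sketch, assuming a Linux system with glibc's <elf.h> and <link.h>; the helper name read_entry_point is made up for this illustration.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>   /* ElfW() selects the Elf32 or Elf64 types for this build */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the e_entry field of an ELF file,
 * or 0 if the file cannot be read or is not an ELF file. */
static unsigned long read_entry_point(const char *path)
{
    ElfW(Ehdr) hdr;
    unsigned long entry = 0;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return 0;

    /* The ELF header sits at offset 0 and starts with the magic "\177ELF". */
    if (fread(&hdr, sizeof hdr, 1, f) == 1 &&
        memcmp(hdr.e_ident, ELFMAG, SELFMAG) == 0)
        entry = (unsigned long)hdr.e_entry;

    fclose(f);
    return entry;
}
```

Calling read_entry_point("/proc/self/exe") from a program gives the same value that `readelf -h` reports as the entry point address. For a position-independent executable that value is an offset from the load base, so it will not match the run-time address of _start directly.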

The key job of the code at _start is to initialize everything required to create the conditions that main() expects. Note that this is much more complex for C++, and since GCC supports both languages, the true startup code has extra features whose sole purpose is to support the requirements of C++.

The source code for _start is almost always written in assembler and is highly platform-specific. For the 32-bit x86 platform, one sample can be found in the glibc source tree, under sysdeps/i386/elf/start.S.

Although you will probably never need to look at this to debug ordinary code on desktop operating systems, a good understanding of how the runtime environment is initialized is often needed when working on small embedded systems. In particular, many embedded systems boot directly from system reset into a version of this startup code. On such a system, it isn't unusual to have to turn on the memory that will be used, configure the CPU's main clock sources, and set the initial stack pointer to something sensible before it is possible to worry about higher-level concepts like the .text, .data and .bss segments.

The version of start.S mentioned above assumes that it is being launched under some flavor of Unix or Linux (I didn't look too carefully). So it gets to assume that the process has been created and that the code and data segments are already loaded and ready to use. It converts the command line parameters from the format supplied by the OS to the familiar argc and argv[] needed to call main(). It does not call main() directly, though: it goes through a wrapper named __libc_start_main(), defined elsewhere in the glibc sources in csu/libc-start.c.
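As an illustration of that conversion, here is a minimal sketch in C. It assumes the Linux convention in which the initial stack holds argc, then the argv pointers, a NULL, the environment pointers, and another NULL. The real code is assembler working on the actual stack pointer; this version works on an ordinary array standing in for the stack, and struct start_args and decode_initial_stack are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for what the startup code recovers. */
struct start_args {
    int argc;
    char **argv;
    char **envp;
};

/* Decode a simulated initial stack laid out as:
 *   sp[0]            argc (stored in a pointer-sized slot)
 *   sp[1..argc]      argv[0..argc-1]
 *   sp[argc+1]       NULL
 *   sp[argc+2..]     environment strings, then NULL
 */
static struct start_args decode_initial_stack(char **sp)
{
    struct start_args a;

    assert(sp != NULL);
    a.argc = (int)(intptr_t)sp[0];   /* first slot is the argument count */
    a.argv = sp + 1;                 /* argv begins right after it */
    a.envp = a.argv + a.argc + 1;    /* skip argv and its NULL terminator */
    return a;
}
```

With a simulated stack of { 2, "prog", "-v", NULL, "HOME=/tmp", NULL }, the function yields argc == 2, argv[0] pointing at "prog", and envp[0] pointing at "HOME=/tmp".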

The source of that function is made to appear hugely complex by the abundance of conditional compilation directives that support a wide range of features. But in essence, it boils down to something like the following for a common case:

STATIC int
__libc_start_main(int (*main) (int, char **, char **),
                 int argc, char **argv,
                 int (*init)(int, char **, char **),
                 void (*fini) (void),
                 void (*rtld_fini) (void), void *__unbounded stack_end)
{

    int result;
    /* some basic initializations go here, then... */
    /* initialize some core parts of the library */
    __libc_init_first (argc, argv, __environ);
    /* arrange to call finalizers at exit if any */
    if (fini)
        __cxa_atexit ((void (*) (void *)) fini, NULL, NULL);
    /* call initializers, if any */
    if (init)
        init(argc, argv, __environ);
    /* call user's actual main, which might not return */
    result = main (argc, argv, __environ);
    /* if main did return, exit appropriately */
    exit (result);
}

I've left out some details in that sketch, but the outline should be mostly accurate. The funny business with the function pointers named init and fini is primarily there to support constructors and destructors of global objects in a C++ program. In a plain C program that defines no such initializers or finalizers, they have nothing to run and there is no visible effect.
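Plain C can hook into the same machinery. The sketch below assumes GCC or Clang, since `__attribute__((constructor))` is a compiler extension rather than standard C; atexit() is standard. The names init_count, before_main, after_main and demo_main are invented for the example, and demo_main stands in for the body of a real main().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int init_count;   /* incremented before main() starts */

/* GCC/Clang extension: the startup code runs this before main(). */
static void __attribute__((constructor)) before_main(void)
{
    init_count++;
}

/* Runs during exit(), i.e. after main() has returned. */
static void after_main(void)
{
    fputs("after main\n", stderr);
}

/* Stand-in for main(): by the time it runs, the constructor has fired.
 * Registers the exit handler and returns 0 on success. */
static int demo_main(void)
{
    assert(init_count == 1);
    return atexit(after_main);
}
```

Setting breakpoints on before_main and after_main in GDB, with `set backtrace past-entry on`, is one way to watch the startup and shutdown paths call into your own code.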
