Questions about shared libraries

Posted 2020-04-16 02:15

Question:

I have some questions regarding shared libraries:

  1. Who loads the shared libraries when I run a binary that depends on shared libraries (.so)?

  2. Where are the shared libraries loaded?

  3. If the shared libraries are already loaded and I run a binary that depends on them, will they be loaded again, or will the binary use the copies that are already loaded?

Answer 1:

Who loads the shared libraries when I run a binary that depends on shared libraries (.so)?

When you exec the binary, the Linux kernel reads the ELF header of the file. Every dynamically linked ELF file names its runtime dynamic linker (for example /lib/ld-linux.so.2 on 32-bit x86, or /lib64/ld-linux-x86-64.so.2 on x86-64) in the PT_INTERP program header, whose contents are the .interp section. When an interpreter is present, the kernel loads the interpreter's ELF image (by mmapping it into memory according to its program headers) and jumps to its entry point.

The runtime dynamic linker then reads your dynamically linked program, finds all the shared libraries it needs, and loads them into memory (again using mmap and the information in the ELF headers).
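The interpreter path that the kernel looks up can be read straight from the file. The following is a minimal sketch, not the kernel's code: the helper name read_interp is made up for this illustration, and it handles only the plain ELF32/ELF64 layouts with no validation beyond the magic bytes.

```python
import struct
import sys

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(path):
    """Return the PT_INTERP string of an ELF file, or None if it has none.

    Hypothetical helper for illustration; it reads the whole file into memory.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: %r" % path)
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is_64:
            (p_offset,) = struct.unpack_from(endian + "Q", data, base + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", data, base + 32)
        else:
            (p_offset,) = struct.unpack_from(endian + "I", data, base + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", data, base + 16)
        raw = data[p_offset:p_offset + p_filesz]
        return raw.split(b"\0", 1)[0].decode()  # the path is NUL-terminated
    return None  # no PT_INTERP: statically linked, or a shared object


if __name__ == "__main__":
    try:
        print(read_interp(sys.executable))
    except (OSError, ValueError) as exc:
        print("could not inspect the Python binary:", exc)
```

Run against a dynamically linked program, this should report the same path that `readelf -l` shows as the requested program interpreter.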

Where are the shared libraries loaded?

Shared libraries are searched for in the directories of the runtime library search path: any DT_RPATH/DT_RUNPATH entries embedded in the binary, $LD_LIBRARY_PATH, the directories listed in /etc/ld.so.conf (consulted through the /etc/ld.so.cache file that ldconfig builds), and finally the default directories such as /lib and /usr/lib.

The memory address at which each library is loaded is chosen by ld-linux.so.2 (and partly by the kernel, for example to randomize the starting address).

The actual library-loading code is in glibc, in the elf/rtld.c file (http://fxr.watson.org/fxr/source/elf/rtld.c?v=GLIBC27#L1731):
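The lookup order can be written down as a small model. This is a sketch of the order documented in the ld.so(8) manual page for glibc, not glibc's code; the function name candidate_dirs and the "<ld.so.cache>" placeholder are invented for the illustration, and the sketch ignores details such as $ORIGIN expansion and empty path entries.

```python
def candidate_dirs(rpath=None, runpath=None, ld_library_path=None,
                   secure_mode=False, defaults=("/lib", "/usr/lib")):
    """Return the places searched for a library name that contains no slash.

    Simplified model of the order described in ld.so(8); hypothetical helper.
    """
    def split(value):
        # Colon-separated list; empty entries are dropped in this sketch.
        return [d for d in (value or "").split(":") if d]

    places = []
    if runpath is None:
        places += split(rpath)            # DT_RPATH counts only without DT_RUNPATH
    if not secure_mode:
        places += split(ld_library_path)  # ignored in secure-execution mode
    places += split(runpath)              # DT_RUNPATH
    places.append("<ld.so.cache>")        # stands for the /etc/ld.so.cache lookup
    places += list(defaults)              # trusted default directories
    return places


if __name__ == "__main__":
    for place in candidate_dirs(rpath="/opt/app/lib", ld_library_path="/tmp/a:/tmp/b"):
        print(place)
```

On a real system, running a program with LD_DEBUG=libs set in the environment makes glibc's dynamic linker print the directories it actually tries.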
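The addresses that were finally chosen can be observed from inside a process by reading /proc/self/maps. The parser below is a rough sketch: the function name shared_object_mappings is made up here, and it treats any mapped file whose name contains ".so" as a shared library.

```python
def shared_object_mappings(maps_text):
    """Group the mappings of shared objects by path.

    maps_text is the content of a /proc/<pid>/maps file. Returns a dict
    {path: [(start, end, perms), ...]}. Hypothetical helper for illustration.
    """
    result = {}
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue                      # anonymous mapping, no path column
        addr_range, perms, _offset, _dev, _inode, path = fields
        if ".so" not in path.rsplit("/", 1)[-1]:
            continue                      # not a shared object (heuristic)
        start, end = (int(part, 16) for part in addr_range.split("-"))
        result.setdefault(path, []).append((start, end, perms))
    return result


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            text = f.read()
    except OSError:
        text = ""                         # not Linux, or /proc is unavailable
    for path, regions in shared_object_mappings(text).items():
        for start, end, perms in regions:
            print("%016x-%016x %s %s" % (start, end, perms, path))
```

Running it twice typically prints different addresses for the same library, which is the address randomization mentioned above.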

 1731   /* Load all the libraries specified by DT_NEEDED entries. ....  */
 1735   _dl_map_object_deps (main_map, preloads, npreloads, mode == trace, 0);

The linker then connects symbol references between the object files (the relocation process; when lazy binding is active, a function symbol is sometimes resolved only when it is first actually referenced).

If the shared libraries are already loaded and I run a binary that depends on them, will they be loaded again, or will the binary use the copies that are already loaded?
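The same kind of symbol lookup can be requested explicitly at run time through dlopen and dlsym, which Python's ctypes module wraps. The sketch below is only an analogue of what the dynamic linker does implicitly for the libraries a program needs; the function name getpid_via_dlsym is invented, and it assumes a Linux system where the C library is already mapped into the Python process.

```python
import ctypes
import os


def getpid_via_dlsym(mode=os.RTLD_LAZY):
    """Resolve getpid by name in the already-loaded libraries and call it.

    ctypes.CDLL(None) calls dlopen(NULL, mode), which returns a handle for
    the main program and the libraries loaded with it. Attribute access on
    the handle performs the dlsym lookup. Hypothetical helper.
    """
    handle = ctypes.CDLL(None, mode=mode)
    return handle.getpid()


if __name__ == "__main__":
    # RTLD_LAZY defers function binding until first use; RTLD_NOW resolves
    # everything at load time. For an already-loaded handle the result is
    # the same either way.
    print(getpid_via_dlsym(os.RTLD_LAZY), getpid_via_dlsym(os.RTLD_NOW), os.getpid())
```

For ordinary dependent libraries, setting the LD_BIND_NOW environment variable asks glibc's dynamic linker to resolve all symbols at startup instead of lazily.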

You should know how mmap works with files. A file is stored on a disk (HDD or SSD) and must be loaded into memory before it can be executed. The linker does not mmap the entire library file, only the parts that hold the library's code and data. Also, the mmap syscall is lazy: it does not load all the requested pieces of the file into memory, but just records the corresponding virtual pages and file offsets. On the first access to such a virtual page, a page fault occurs (a major page fault if the data is not yet in the page cache) and that part of the file is read from disk (Linux may read more pages than requested; there are also prefetchers that read libraries into memory early in the boot process).

If several processes mmap the same file privately (MAP_PRIVATE, as the dynamic linker does), the copy-on-write mechanism is used. This means that as long as a memory page is only read, there is a single physical page, with several virtual pages mapped to it, all with write access disallowed. On the first write to such a page, a copy is made (via a minor page fault): the original physical page is copied into a new physical page, and the mapping is changed for the process that performed the write. After the page-fault handler returns, the write instruction is restarted and writes into the process's own copy of the page.

Most of the executable code of shared libraries is never written to, so it is shared between all processes. The data segments (.data, .bss, .tdata, .tbss) are not shared, because they are written to. Relocations also unshare some pages.
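Mapping only part of a file can be tried out with Python's mmap module. In this sketch the helper name read_via_mmap is made up; it maps just the pages that cover the requested byte range and leaves it to the kernel to bring in the data when the bytes are first touched.

```python
import mmap


def read_via_mmap(path, offset, length):
    """Read `length` bytes at `offset` by mapping only the pages that cover them.

    Hypothetical helper. The range must lie inside the file, because a
    read-only mapping cannot extend past the end of the file.
    """
    gran = mmap.ALLOCATIONGRANULARITY      # mapping offsets must be multiples of this
    start = (offset // gran) * gran        # round down to an allowed offset
    delta = offset - start
    with open(path, "rb") as f:
        # Creating the mapping reads nothing from the file yet.
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=start) as m:
            # Slicing touches the pages; this is where page faults happen.
            return m[delta:delta + length]


if __name__ == "__main__":
    import sys
    print(read_via_mmap(sys.executable, 0, 4))
```

The dynamic linker works on the same principle, except that the ranges it maps come from the library's ELF program headers.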
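The effect of copy-on-write is visible even inside a single process. In the sketch below, two mappings of one file stand in for two processes; the function name private_write_demo is invented for the example. Python's ACCESS_COPY creates a private, writable mapping, so a write lands in a private copy of the page and reaches neither the other mapping nor the file.

```python
import mmap


def private_write_demo(path, new_byte=b"X"):
    """Write one byte through a private mapping and report who sees it.

    Returns (private_view, other_view, on_disk) for the first byte of the
    file. Hypothetical helper; the file must not be empty.
    """
    with open(path, "r+b") as f:
        private = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)  # private, writable
        other = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)    # read-only view
        try:
            private[0:1] = new_byte        # triggers the copy of that page
            seen_private = private[0:1]
            seen_other = other[0:1]
        finally:
            private.close()
            other.close()
    with open(path, "rb") as f:
        on_disk = f.read(1)
    return seen_private, seen_other, on_disk


if __name__ == "__main__":
    import os
    import tempfile
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    print(private_write_demo(tmp.name))
    os.unlink(tmp.name)
```

Only the private mapping observes the new byte; the other mapping and the file keep the original content.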



Answer 2:

The answer to your questions 1 and 2 is: it depends (on the OS you are using). On most UNIX OSes, the runtime loader (usually ld.so or ld-linux.so) loads the shared libraries, wherever it pleases.

For question 3, shared libraries are usually shared between processes, so yes: a newly loaded executable will reuse shared libraries that have already been loaded by some other process. Note: only the code (and read-only data) segments are shared; each process gets its own copy of the writable data segment.

For Linux, this guide details the process of shared library loading (likely with more details than you care about).

Also, this book has an early draft available on-line.