I have some questions regarding shared libraries:
1. Who loads the shared libraries when I run a binary that depends on shared libraries (.so)?
2. Where are the shared libraries loaded?
3. If the shared libraries are already loaded and I run a binary that depends on them, will they be loaded again, or will the binary use the already-loaded copies?
When you exec the binary, the Linux kernel reads the ELF header of your file. Every dynamically linked ELF file names its runtime dynamic linker (for example /lib/ld-linux.so.2 on 32-bit x86, or /lib64/ld-linux-x86-64.so.2 on x86-64) in the PT_INTERP program header, whose contents are stored in the .interp (interpreter) section. When an interpreter is present, the kernel loads the interpreter's ELF image (by mmaping it into memory according to its program headers) and jumps to its entry point.
The runtime dynamic linker then reads your dynamically linked program, finds all the shared libraries it needs, and loads them into memory (again using mmap and the information in the ELF headers).
Shared libraries are searched for in every directory on the runtime library search path ($LD_LIBRARY_PATH and the directories listed in /etc/ld.so.conf, in addition to the default system library directories).
The memory address at which each library is loaded is chosen by ld-linux.so.2 (and possibly by the kernel, for example to randomize the starting address).
The actual library-loading code is in glibc, in the file elf/rtld.c: http://fxr.watson.org/fxr/source/elf/rtld.c?v=GLIBC27#L1731
The linker then resolves symbol references between the objects (the relocation process; with lazy binding active, a function symbol may only be resolved when it is first called).
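To make the first step above concrete, here is a minimal sketch of how the interpreter path can be found in an ELF image, roughly what the kernel has to do before it can start the dynamic linker. The function name read_interp is my own, and the sketch assumes a well-formed ELF file; the field offsets and the PT_INTERP value (3) come from the ELF specification.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(data: bytes):
    """Return the PT_INTERP path of an ELF image, or None if there is none.

    Hypothetical helper for illustration; assumes a well-formed ELF file.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian

    # Locate the program header table from the ELF header.
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)

    # Walk the program headers looking for PT_INTERP.
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if is_64:
            # Elf64_Phdr starts: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
            p_type, _, p_offset, _, _, p_filesz = struct.unpack_from(
                endian + "IIQQQQ", data, off)
        else:
            # Elf32_Phdr starts: p_type, p_offset, p_vaddr, p_paddr, p_filesz
            p_type, p_offset, _, _, p_filesz = struct.unpack_from(
                endian + "IIIII", data, off)
        if p_type == PT_INTERP:
            # The segment holds a NUL-terminated path string.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # statically linked: no interpreter requested
```

On a Linux system you could try read_interp(open("/bin/ls", "rb").read()); the path you get back depends on the architecture and distribution, and a statically linked binary yields None.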
You should know how mmap works with files. The library is a file stored on disk (HDD or SSD), and it must be brought into memory before it can be executed. The linker does not mmap the entire library file, only the segments that contain the library's code and data. Also, the mmap syscall is lazy: it does not read the requested parts of the file into memory, it only records the virtual pages and the corresponding file offsets. On the first access to such a virtual page, a page fault occurs (a major page fault if the data must be read from disk) and that part of the file is read in (Linux may read more pages than requested; there are also prefetchers which read libraries into memory early in the boot process).
If several processes mmap the same file, the copy-on-write mechanism is used. This means that as long as a memory page is only read, there is a single physical page, and several virtual pages map to it, all with write access disallowed. On the first write to the page by a process, a copy is made (via a minor page fault): the original physical page is copied into a new physical page, and the mapping is changed for the process that performed the write. After returning from the page fault, the write instruction is restarted and writes into the process's own copy of the page.
Most of the executable code in shared libraries is never written to, so it is shared between all processes. The data segments (.data, .bss, .tdata, .tbss) are not shared, because they are written to. Relocations also unshare some pages.
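The copy-on-write behaviour can be observed from user space with a private file mapping. The sketch below is only an illustration, not what ld.so itself does: it uses Python's mmap module with ACCESS_COPY (a private, copy-on-write mapping, MAP_PRIVATE on Unix), and the function name private_write_demo and the temporary file are made up for the example.

```python
import mmap
import os
import tempfile


def private_write_demo():
    """Write through a copy-on-write mapping.

    Returns (bytes seen through the mapping, bytes still in the file).
    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        # ACCESS_COPY: writes go to a private copy of the page,
        # never back to the underlying file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as m:
            m[:8] = b"modified"          # triggers the copy of that page
            in_mapping = bytes(m[:8])
        with open(path, "rb") as f:
            on_disk = f.read()
        return in_mapping, on_disk
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling private_write_demo() returns (b"modified", b"original"): the process sees its own modified copy, while the file (and any other process mapping it) still sees the original contents. This is the same mechanism that keeps one process's writes to a library's data segment from affecting other processes.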
The answer to your questions 1 and 2 is: it depends (on the OS you are using). On most UNIX OSes, the runtime loader (usually ld.so or ld-linux.so) loads the shared libraries, at whatever addresses it chooses.
For question 3: shared libraries are usually shared between processes, so yes, a newly loaded executable will reuse shared libraries that have already been loaded by some other process. Note that only the code (and read-only data) segments are shared; each process gets its own copy of the writable data segment.
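On Linux you can see where the loader put each library, and which parts are read-only or writable, by looking at /proc/<pid>/maps. The following is a rough sketch that parses text in that format; the function name mapped_libraries is my own, and the ".so in the file name" test is a simplifying assumption about how libraries are named.

```python
from collections import defaultdict


def mapped_libraries(maps_text):
    """Map each shared library path to the permissions of its mappings.

    Expects text in the /proc/<pid>/maps format:
    address perms offset dev inode pathname
    Hypothetical helper for illustration only.
    """
    libs = defaultdict(list)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        perms, path = fields[1], fields[5].strip()
        # Simplifying assumption: library file names contain ".so".
        if ".so" in path.rsplit("/", 1)[-1]:
            libs[path].append(perms)
    return dict(libs)
```

For example, mapped_libraries(open("/proc/self/maps").read()) lists the libraries mapped into the running Python interpreter. A typical library shows up several times: the r-xp mapping is the code, whose physical pages can be shared with other processes, and the rw-p mapping is the writable data, of which each process gets its own copy once it writes to it.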
For Linux, this guide details the process of shared library loading (likely with more details than you care about).
Also, this book has an early draft available on-line.