As far as I know, x86 assembly code is very much constrained by the limited number of registers.
When I learnt that on Linux, to create a .so file, one has to pass the -fPIC command line argument to gcc in order to generate position-independent code, I couldn't believe it at first.
As far as I know, the ELF file format supports relocations, just like the Windows DLL system (which is, in my eyes, much better): on Windows, the loader relocates all the offsets in a DLL if this is necessary.
I think that the time needed to load a .so or DLL file, and the memory needed to keep differently relocated copies of a .so file around, are not as bad as permanently losing a whole register to point to the GOT and having all these indirect jumps.
I also don't care at all about ASLR etc. for the applications I have in mind, where I only care about having the code in a library optimized as much as possible.
1) Why does Linux not support dynamic library loading the way Windows does, which should produce much more performant code?
So far I have found no real explanation for it, just claims that relocating code would be very bad and slow. Of course, for loading a word processor on a desktop machine, it matters how fast it loads; I fully accept that. But for a computationally intensive server process (one not processing malicious data from the internet), I'd like to have all the performance and registers I can get!
2) Is it possible for me to create .so files on Linux that are NOT compiled with -fPIC? Can I just leave out the -fPIC? Is there any howto, manual or project that covers this topic and makes it possible to not waste a whole register and still load libraries dynamically?
What happens if I just drop the -fPIC when compiling a .so-file?
The resulting shared object ELF file would (very probably) be dynamically loaded at semi-random (i.e. unpredictable) page addresses (e.g. because the `mmap` syscall is subject to ASLR). The linker would also emit a very large number of relocations, so the dynamic linker (`ld.so`) would have to process them all, slowly, at load time. Your text segment would have to be rewritten, and so could not be efficiently shared read-only with other processes using the same `.so` file.
So in practice, omitting `-fPIC` on a shared object (i.e. a dynamically linked library) is most often a bad idea, even if it is possible.
Read Drepper's "How To Write Shared Libraries" paper and Wheeler's Program Library HOWTO.
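If you want to see the text relocations for yourself, here is a minimal sketch. The file name `counter.c`, the library names and the function names are all made up for illustration. The build commands in the trailing comment assume gcc and binutils with 32-bit multilib support, and I have not checked them against every toolchain.

```c
/* counter.c: hypothetical library source, used only for illustration. */

/* A global with external linkage. Code touching it needs either a GOT
 * lookup (PIC) or a load-time patch of the instruction itself (non-PIC). */
int counter_value = 0;

/* Increment the global counter and return the new value. */
int counter_next(void)
{
    counter_value += 1;
    return counter_value;
}

/* Reset the global counter to zero. */
void counter_reset(void)
{
    counter_value = 0;
}

/*
 * Build both variants and inspect the dynamic section (assumed commands):
 *
 *   gcc -m32 -O2 -shared -fPIC    counter.c -o libcounter_pic.so
 *   gcc -m32 -O2 -shared -fno-pic counter.c -o libcounter_nopic.so
 *   readelf -d libcounter_nopic.so | grep TEXTREL
 *
 * -fno-pic is spelled out because many distributions configure gcc to
 * default to PIE, so merely omitting -fPIC may not give non-PIC code.
 * A TEXTREL entry means ld.so has to patch the text segment at load time.
 * Newer linkers typically warn about this, and on x86-64 the non-PIC link
 * usually fails outright with a "recompile with -fPIC" error.
 */
```

The PIC build should show no TEXTREL entry, which is what lets its text pages stay read-only and shared between processes.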
BTW, position-independent code is much more costly on 32-bit x86 than on x86-64, which has RIP-relative addressing and twice as many general-purpose registers. But it is worth the effort (PIC code is probably at most 5 to 10% slower than non-PIC code on 32-bit x86).
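To judge the cost on your own code rather than trusting a rule of thumb, you can compare the generated assembly. Below is a small sketch with a made-up file `sum.c`. The commands in the comment are an assumption about a typical gcc setup (`-m32` needs multilib), and the exact instructions you get will vary with compiler version and flags.

```c
/* sum.c: hypothetical example for comparing PIC and non-PIC output. */

#define TABLE_LEN 4

/* Non-const global, so the compiler must really load it from memory. */
int table[TABLE_LEN] = {1, 2, 3, 4};

/* Sum the entries of the global table. */
int table_sum(void)
{
    int total = 0;
    for (int i = 0; i < TABLE_LEN; i++)
        total += table[i];
    return total;
}

/*
 * Generate assembly for the three cases (assumed commands):
 *
 *   gcc -m32 -O2 -fno-pic -S sum.c -o sum_nopic32.s
 *   gcc -m32 -O2 -fPIC    -S sum.c -o sum_pic32.s
 *   gcc      -O2 -fPIC    -S sum.c -o sum_pic64.s
 *
 * In the 32-bit PIC output you would typically see a call to a
 * __x86.get_pc_thunk.* helper that loads the GOT base into a register,
 * plus @GOT references. The 64-bit PIC output typically reaches the GOT
 * with @GOTPCREL(%rip) and needs no dedicated base register.
 */
```

Timing a hot loop from your real library built both ways is the only way to know whether the slowdown matters for your workload.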