I am working on a multiprocessor architectural simulator that uses Intel Pin to instrument C++ executable binaries and report interesting events (e.g., some function calls, thread create/finish, etc.). Basically, I build an instruction-decode cache of all instructions when their images are loaded and analyze instruction execution afterwards. So it is important for instruction addresses at image-load time to be the same as (or at least get updated synchronously with) instruction addresses at run-time.
Intel Pin API (e.g., IMG_AddInstrumentFunction) enables me to get information about the loaded images (executables and shared libraries) such as entry points, low/high address, etc.
However, I noticed that the instrumented program executes instructions at addresses that do not belong to any of the loaded images. By inspection, I am suspecting that the dynamic loader (image /lib64/ld-linux-x86-64.so.2 on 64-bit Centos 6.3) is relocating the main executable in memory by calling routine _dl_relocate_object.
I understand the need for relocatable code and all that stuff. I just need pointers to a good documentation (or just a brief description/advice) on how/when these relocations might happen (at load-time and runtime) so that I can take them into account in my architectural simulator. In other words, the mechanism used to achieve it (library functions that I need to instrument, conditions, or maybe randomization if there is any, g++ compiler switches that can be used to suppress relocation, etc). P.S.: I am only targeting x86/x86_64 architectures