How is it that main function is always loaded at t

Posted 2019-05-06 23:21

I wrote this small program today and was blown away by the results. Here is the program:


#include <stdio.h>

int main(int argc, char **argv)
{
    int a;
    /* %p expects a void *, so both pointers are cast */
    printf("\n\tMain is located at: %p and the variable a is located at address: %p", (void *)main, (void *)&a);
    return 0;
}

On my machine the main function is always loaded at address "0x80483d4", while the address of the variable keeps varying. How does this happen? I read, while studying operating systems, that as part of the virtualization scheme the OS keeps relocating the addresses of instructions. So why is main loaded at the same address every time I run this program?

Thanks in advance, guys.

2 answers
聊天终结者
Reply #2 · 2019-05-06 23:59

On ELF systems such as Linux, the addresses at which the segments of normal executable files (ELF type ET_EXEC) load are fixed when the executable is built (at link time). Shared objects (ELF type ET_DYN) such as libraries are built to be position-independent, so their segments can be loaded anywhere in the address space (potentially with some restrictions on some architectures). It is possible to build executables that are actually ET_DYN; these are known as "position-independent executables" (PIE). This was historically an uncommon technique, although many newer Linux distributions configure their compilers to build PIE by default.
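To see which of the two ELF types a given binary is, one option is to read the e_type field from its ELF header. The following is a minimal sketch, assuming a Linux system that provides <elf.h>; the helper names elf_type_of and elf_type_name are hypothetical and not part of any library.

```c
#include <elf.h>     /* EI_NIDENT, ELFMAG, ET_EXEC, ET_DYN (Linux header) */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the e_type field of the ELF file at `path`,
 * or -1 if the file cannot be read or does not start with the ELF magic. */
int elf_type_of(const char *path)
{
    /* e_ident[] is followed directly by the 16-bit e_type field,
     * at the same offset in both 32-bit and 64-bit ELF headers. */
    unsigned char hdr[EI_NIDENT + 2];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(hdr, 1, sizeof hdr, f);
    fclose(f);
    if (n != sizeof hdr || memcmp(hdr, ELFMAG, SELFMAG) != 0)
        return -1;
    /* Decode e_type using the byte order recorded in the file itself. */
    if (hdr[EI_DATA] == ELFDATA2MSB)
        return (hdr[EI_NIDENT] << 8) | hdr[EI_NIDENT + 1];
    return hdr[EI_NIDENT] | (hdr[EI_NIDENT + 1] << 8);
}

/* Hypothetical helper: human-readable label for the two types discussed. */
const char *elf_type_name(int type)
{
    switch (type) {
    case ET_EXEC: return "ET_EXEC (fixed load address)";
    case ET_DYN:  return "ET_DYN (position-independent)";
    default:      return "other or unreadable";
    }
}
```

Calling elf_type_name(elf_type_of("/proc/self/exe")) from main() should report the type of the running program itself; /proc/self/exe is a Linux-specific path, so this part is an assumption about the platform as well.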

What you are seeing is that your main() function lives in the fixed-address text segment of your compiled executable. Try also printing the address of a library function such as printf() after locating it via dlsym(). If your system supports address space layout randomization (ASLR) and has it enabled, you should see the address of that function change from run to run. (If you just print the address of the library function by referencing it directly in your code, what you may actually get is the address of the function's procedure linkage table (PLT) trampoline, which is statically compiled at a fixed address in your executable.)

The variable's address changes from run to run because it is an automatic variable created on the stack, not in statically allocated memory. Depending on the OS and version, the base address of the stack may shift from run to run even without ASLR. If you move the variable declaration outside your function to make it a global, you will see it behave the same way your main() function does.

Here's a full example. Compile with something like gcc -o example example.c -ldl:

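#include <stdio.h>
#include <dlfcn.h>

int a = 0;                      /* global: statically allocated */

int main(int argc, char **argv)
{
    int b = 0;                  /* automatic: lives on the stack */
    /* A NULL filename gives a handle for the main program and its loaded dependencies */
    void *handle = dlopen(NULL, RTLD_LAZY);
    printf("&main: %p; &a: %p\n", (void *)&main, (void *)&a);
    printf("&printf: %p; &b: %p\n", dlsym(handle, "printf"), (void *)&b);
    return 0;
}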
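Whether the library and stack addresses actually move between runs depends on whether ASLR is enabled. On Linux the current setting can be read from /proc/sys/kernel/randomize_va_space. The sketch below assumes a Linux system; the function names aslr_setting and aslr_description are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns the value stored in
 * /proc/sys/kernel/randomize_va_space, or -1 if the file cannot be
 * read or parsed (for example on a non-Linux system). */
int aslr_setting(void)
{
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");
    int value = -1;
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Hypothetical helper: short description of the documented values. */
const char *aslr_description(int value)
{
    switch (value) {
    case 0:  return "disabled";
    case 1:  return "stack, mmap base and vDSO randomized";
    case 2:  return "as 1, plus heap (brk) randomized";
    default: return "unknown";
    }
}
```

With a setting of 0, the printf address found through dlsym() and the address of the stack variable would be expected to stay the same from run to run; with 1 or 2 they would be expected to vary.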
贼婆χ
Reply #3 · 2019-05-07 00:23

main(...) is called by runtime start-up code, which the operating system loads and executes each time the program runs. Have a look at the CRT (C runtime library), which contains the code that does this; the details depend on your compiler.
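One way to observe that code runs before main() is a constructor function, which the start-up machinery calls before it hands control to main(). This is a minimal sketch that assumes GCC or Clang, since __attribute__((constructor)) is a compiler extension and not standard C; the names startup_ran, before_main and startup_code_ran are hypothetical.

```c
#include <stdio.h>

/* Set by the constructor below; stays 0 if the constructor never ran. */
static int startup_ran = 0;

/* GCC/Clang extension: a function marked `constructor` is called
 * during program start-up, before main() is entered. */
__attribute__((constructor))
static void before_main(void)
{
    startup_ran = 1;
}

/* Returns 1 if the constructor has already run, 0 otherwise. */
int startup_code_ran(void)
{
    return startup_ran;
}
```

Checking startup_code_ran() as the first statement of main() shows whether the start-up code got there first.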

Another thing to bear in mind: I would not worry about that address too much as long as the C code works. The pattern is incidental and depends on a number of factors such as OS load, the drivers in use, the hardware, antivirus software, etc.

Also, regarding the code: if you add static variables, functions, or pointers, the layout of the binary will change and, more importantly, the addresses at which those symbols are loaded at run time will be different.
