Variables and calls in a C program and its corresp

Posted 2019-08-25 01:18

Question:

I'm currently learning about the Linux process address space and I'm not sure which parts of it these C variables correspond to.

I know that when a function is called, a new frame is created; it contains the local variables, other function calls, etc.

What I am not sure about is the pointers that are in the frame:

I have this function:

#include <stdlib.h>  /* malloc */
#include <string.h>  /* strcpy */

int main(void){
    char *pointer1 = NULL;
    char *pointer2 = (void *)0xDDDDDDDD;
    pointer1 = malloc(80);
    strcpy(pointer1, "Testing..");
    return(0);
}

When main is called, a new frame is created.

Variables are initialized.

What I am not sure about is the pointers:

  • Where does *pointer1 correspond to in the process address space: the data or the text section?

  • Where does *pointer2 correspond to in the process address space: the data or the text section?

  • Do NULL and 0xDDDDDDDD belong to the data or the text section?

  • Since pointer1 = malloc(80), does it belong to the stack section?

Answer 1:

First of all, note that the C specification doesn't actually require local variables to be stored on a stack; it doesn't specify the location of automatic variables at all.

With that said, the storage for the variables pointer1 and pointer2 themselves will most likely be put on the stack by the compiler. Memory for them will be part of the stack frame that is set up when the main function is called.

To continue, on modern PC-like systems a pointer is really nothing more than a simple unsigned integer whose value is the address it points to. The values you use for the initialization (NULL and 0xDDDDDDDD) are simply plain integer values, and the initialization is done just as it would be for a plain int variable. As such, the values used for initialization don't really exist as "data"; instead they can be encoded directly in the machine code, in which case they are stored in the "text" (code) segment.
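To make the "pointer is just an integer" point concrete, here is a minimal sketch. The helper names pointer_value and describe_pointer are made up for illustration, and the round trip through uintptr_t is implementation-defined in C, although it behaves as shown on common Linux toolchains.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns the numeric value stored in a pointer. */
uintptr_t pointer_value(const void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: prints where a pointer variable itself lives
 * (its own address) and which address value it currently holds. */
void describe_pointer(const char *name, char **var)
{
    printf("%s is stored at %p and holds 0x%jx\n",
           name, (void *)var, (uintmax_t)pointer_value(*var));
}
```

Called as describe_pointer("pointer1", &pointer1) from main, the first address printed is typically a stack address, while the held value is whatever was assigned (0 for NULL on Linux).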

Lastly, the dynamic allocation doesn't change where pointer1 is stored. What it does is simply assign a new value to pointer1. The memory being allocated is on the "heap", which is separate from the other program sections (i.e. it is not in the code, data or stack segments).
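On Linux, one way to check where an address ends up in a running process is to look it up in /proc/self/maps. The following is only a sketch: mapping_label is a hypothetical helper name, and the labels it reports (such as [stack] and [heap]) depend on the kernel and the C library in use.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the mapping that contains addr and copies its
 * label into out (the label is "" for unnamed anonymous mappings).
 * Returns 0 if a mapping was found, -1 otherwise. Linux-specific. */
int mapping_label(const void *addr, char *out, size_t out_size)
{
    unsigned long target = (unsigned long)(uintptr_t)addr;
    char line[1024];
    int found = -1;
    FILE *maps;

    if (out == NULL || out_size == 0)
        return -1;
    maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char label[256] = "";
        /* Line format: start-end perms offset dev inode [label] */
        int fields = sscanf(line, "%lx-%lx %*s %*s %*s %*s %255[^\n]",
                            &start, &end, label);
        if (fields < 2)
            continue;
        if (target >= start && target < end) {
            strncpy(out, label, out_size - 1);
            out[out_size - 1] = '\0';
            found = 0;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

Passing the address of a local variable in main typically reports [stack], a pointer returned by a small malloc typically reports [heap] with glibc, and the address of a static variable reports the path of the executable.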



Answer 2:

As some programmer dude just said, the C spec does not state a region where automatic variables must be placed, but it is usual for compilers to grow the stack to accommodate them there. However, they might end up in the .data region, and they will if they are defined as, e.g., static char *pointer1 instead.

The initialization values may or may not exist in a program region either. In your case, since the values are plain integer constants, most architectures will inline the initialization as machine instructions, provided instructions with suitable immediate operands are available. On x86_64, for example, a single mov/movq instruction will be issued to put the 0 (NULL) or the other integer in the appropriate memory location on the stack.

However, initialized variables with static storage duration, such as static char string[40] = "Hello world" or other initialized global variables, end up in the .data region and take up space there. Compilers may place global variables that are declared without an initializer in the .bss region instead.
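A small sketch of the distinction follows. The names are made up for illustration, and the section placement in the comments is what GCC and Clang typically do for ELF targets, not something the C standard guarantees.

```c
#include <stddef.h>

/* Initialized, static storage duration: typically placed in .data. */
static char greeting[40] = "Hello world";

/* No initializer, static storage duration: zero-filled at program start
 * and typically placed in .bss, so it takes no space in the file. */
static char scratch[64];

/* Returns 1 if every byte of the uninitialized static buffer is zero. */
int scratch_is_zeroed(void)
{
    for (size_t i = 0; i < sizeof scratch; i++) {
        if (scratch[i] != 0)
            return 0;
    }
    return 1;
}

/* Returns a pointer to the initialized static buffer. */
const char *get_greeting(void)
{
    return greeting;
}

/* A static local keeps its value between calls because it is not part
 * of the function's stack frame. */
int next_call_number(void)
{
    static int calls;  /* zero-initialized once, before the first call */
    return ++calls;
}
```

Running the size tool on the compiled binary is one way to see the .data and .bss totals change as such variables are added.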

The question "since pointer1 = malloc(80), does it belong to the stack section?" is ill-defined, because it mixes two things.

The variable pointer1 holds a value that is stored at &pointer1, an address which, given the above considerations, the compiler has probably placed on the stack.

The result of malloc(80) is a value that refers to a region on the heap, a different region that is dynamically allocated outside the regions mapped from the program file. On Linux, calling malloc may even create a new anonymous memory region (that is, a transient region that is not backed by a file, although it can be swapped out by the kernel).

In essence, you could think of malloc(80) as behaving something like this (free() is not taken into consideration, so this is an oversimplification):
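For reference, this is roughly what requesting such an anonymous region directly from the kernel looks like. It is a sketch: map_anonymous and unmap_anonymous are hypothetical wrapper names, while mmap, munmap and the flags are the real Linux interfaces.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS under strict -std modes on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: asks the kernel for len bytes of private,
 * anonymous (not file-backed) memory. Returns NULL on failure. */
void *map_anonymous(size_t len)
{
    void *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return region == MAP_FAILED ? NULL : region;
}

/* Hypothetical wrapper: hands the region back to the kernel.
 * Returns 0 on success and -1 on error, like munmap itself. */
int unmap_anonymous(void *region, size_t len)
{
    return munmap(region, len);
}
```

A freshly mapped anonymous region reads as zeros, and the kernel rounds the length up to a whole number of pages.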

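#include <stddef.h>
#include <sys/mman.h>

#define MALLOC_CHUNK_LENGTH (1024 * 1024)  /* assumed chunk size */

static size_t space_left = 0;
static char *last_mapping = NULL;

/* Named sketch_malloc so it does not clash with the real malloc. */
void *sketch_malloc(size_t req) {
    void *result;
    if (req > MALLOC_CHUNK_LENGTH)
        return NULL;  /* larger than one chunk: not handled here */
    if (space_left < req) {
        void *chunk = mmap(NULL, MALLOC_CHUNK_LENGTH, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (chunk == MAP_FAILED)
            return NULL;
        last_mapping = chunk;
        space_left = MALLOC_CHUNK_LENGTH;
    }
    space_left -= req;
    result = last_mapping;
    last_mapping += req;
    return result;
}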

The huge difference between calling malloc and calling mmap with MAP_PRIVATE is that mmap is a Linux system call: each call has to switch into the kernel, create a new memory mapping and update the process's virtual memory structures for every chunk allocated. malloc can be more intelligent: it can use a single big region as its "heap" and manage the individual malloc and free calls in userspace once the heap has been set up (until the heap runs out of space, at which point it might have to manage multiple heaps).



Answer 3:

Regarding the last part of your question, "since pointer1 = malloc(80), does it belong to the stack section?", I can tell you the following.

In C, dynamic memory is allocated from the heap using some standard library functions. The two key dynamic memory functions are malloc() and free().

The malloc() function takes a single parameter, which is the size of the requested memory area in bytes. It returns a pointer to the allocated memory. If the allocation fails, it returns NULL. The prototype for the standard library function is like this:

      void *malloc(size_t size);

The free() function takes the pointer returned by malloc() and de-allocates the memory. No indication of success or failure is returned. The function prototype is like this:

      void free(void *pointer);
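As a short sketch of how the two functions are usually paired, the helper below copies a string into heap memory and checks for allocation failure. The name copy_to_heap is made up for illustration; the caller is responsible for passing the result to free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of text,
 * or NULL if malloc() could not provide the memory. */
char *copy_to_heap(const char *text)
{
    size_t size = strlen(text) + 1;  /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy == NULL)
        return NULL;                 /* allocation failed */
    memcpy(copy, text, size);
    return copy;
}
```

The returned pointer itself can be stored in a local variable (so usually on the stack), while the bytes it points to live on the heap until free() is called on it.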

For more detail, you can refer to the article at https://www.design-reuse.com/articles/25090/dynamic-memory-allocation-fragmentation-c.html