I'm currently learning about the Linux process address space, and I'm not sure where these C variables end up in it.
I know that when a function is called, a new frame is created, which holds its local variables and so on.
What I am not sure about is the pointers that are in the frame.
I have this function:
    #include <stdlib.h>   /* malloc, NULL */
    #include <string.h>   /* strcpy */

    int main(void) {
        char *pointer1 = NULL;
        char *pointer2 = (void *)0xDDDDDDDD;
        pointer1 = malloc(80);
        strcpy(pointer1, "Testing..");
        return 0;
    }
When main is called, a new frame is created.
Variables are initialized.
What I am not sure about is the pointers:
Where does *pointer1 correspond to in the process address space - the data or the text section?
Where does *pointer2 correspond to in the process address space - the data or the text section?
Do NULL and 0xDDDDDDDD belong to the data or the text section?
Since pointer1 = malloc(80), does it belong to the stack section?
First of all, it should be noted that the C specification doesn't actually require local variables to be stored on a stack; it doesn't specify the location of automatic variables at all.
With that said, the storage for the variables pointer1 and pointer2 themselves will most likely be put on the stack by the compiler. Memory for them will be part of the stack frame that the compiler-generated code sets up when the main function is called.
To continue, on modern PC-like systems a pointer is really nothing more than a simple unsigned integer whose value is the address it points to. The values you use for the initialization (NULL and 0xDDDDDDDD) are simply plain integer values, and the initialization is done just the same as for a plain int variable. As such, the values used for initialization don't really exist as "data"; instead they can be encoded directly in the machine code, in which case they are stored in the "text" (code) segment.
Lastly, the dynamic allocation doesn't change where pointer1 is stored. What it does is simply assign a new value to pointer1. The memory being allocated is on the "heap", which is separate from any program section (i.e. it's in neither the code, data nor stack segments).
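The difference between where a pointer variable is stored and where it points can be made visible by printing both. The following is only a minimal sketch built on the code from the question: the function name describe_pointers is hypothetical, the cast through uintptr_t is added just to avoid a size warning on 64-bit systems, and the actual addresses will differ from run to run.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints where the two pointer variables live and
 * where they point. Returns 1 if the variable pointer1 and the block it
 * points to are at different addresses, 0 if malloc fails. */
int describe_pointers(void)
{
    char *pointer1 = NULL;
    /* Arbitrary address from the question; it is never dereferenced. */
    char *pointer2 = (char *)(uintptr_t)0xDDDDDDDD;

    pointer1 = malloc(80);
    if (pointer1 == NULL)
        return 0;
    strcpy(pointer1, "Testing..");

    /* Addresses of the variables themselves: automatic storage,
     * which most compilers place in the stack frame. */
    printf("&pointer1 = %p\n", (void *)&pointer1);
    printf("&pointer2 = %p\n", (void *)&pointer2);

    /* Values stored in the variables: the addresses they point to. */
    printf(" pointer1 = %p (block from malloc holding \"%s\")\n",
           (void *)pointer1, pointer1);
    printf(" pointer2 = %p\n", (void *)pointer2);

    int distinct = ((void *)&pointer1 != (void *)pointer1);
    free(pointer1);
    return distinct;
}
```

Called from main on a typical Linux system, this would be expected to print two neighbouring addresses for &pointer1 and &pointer2 and a clearly different one for the value of pointer1.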
As some programmer dude just said, the C spec does not state a region where automatic variables must be placed, but it is usual for compilers to grow the stack to accommodate them there. However, they might end up in the .data region instead, and they will if they are defined as, e.g., static char *pointer1 (or in .bss, if the static variable has no non-zero initializer).
The initialization values may or may not exist in a program region either. In your case, since the values are plain integer constants, most architectures will encode them directly in the machine instructions that perform the initialization, provided instructions with suitable immediate operands are available. On x86_64, for example, a mov/movq instruction will typically be emitted to put the 0 (NULL) or the other constant into the appropriate memory location on the stack.
However, initialized variables with static storage duration, such as static char string[40] = "Hello world" or other initialized global variables, end up in the .data region and take up space there. Compilers may place global variables that are declared without an initializer in the .bss region instead.
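As a rough illustration of these storage classes, here is a small sketch. All names in it (greeting, scratch, count_calls, print_storage_addresses) are made up for the example, and the section names in the comments describe what common Linux toolchains usually do, not something the C standard guarantees.

```c
#include <stdio.h>

static char greeting[40] = "Hello world"; /* initialized: typically .data */
static char scratch[40];                  /* no initializer: typically .bss */

/* Hypothetical helper: counts its own calls using a static local variable. */
int count_calls(void)
{
    static int calls;   /* static storage: survives between calls, not in the frame */
    int step = 1;       /* automatic storage: usually on the stack or in a register */

    calls += step;
    return calls;
}

/* Prints one address per kind of storage so they can be compared by eye. */
void print_storage_addresses(void)
{
    int local = 0;

    printf("greeting (typically .data) at %p\n", (void *)greeting);
    printf("scratch  (typically .bss)  at %p\n", (void *)scratch);
    printf("local    (typically stack) at %p\n", (void *)&local);
}
```

Because calls has static storage, it keeps its value between invocations of count_calls, which would not be possible if it lived in the function's stack frame.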
The question "since pointer1 = malloc(80), does it belong to the stack section?" is ill-defined, because it mixes two things.
The variable pointer1 holds a value that is saved at the address &pointer1, an address which, given the above considerations, the compiler may have put on the stack.
The result of malloc(80) is a value that refers to a region on the heap, a different region, dynamically allocated outside the mapped program space.
On Linux, calling malloc may even create a new anonymous memory region (that is, a transient region that is not backed by a file, although it could be swapped out by the kernel).
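On Linux, one way to check which region an address falls into is to look it up in /proc/self/maps. The sketch below assumes that file is readable and has its usual "start-end perms offset dev inode name" line layout; region_of and show_regions are hypothetical helper names. With glibc, a small allocation is likely to be reported under [heap] and a local variable under [stack], but neither label is guaranteed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: finds the /proc/self/maps entry that contains addr
 * and copies its label (e.g. "[stack]", "[heap]", a file path, or "" for an
 * unnamed anonymous mapping) into name. Returns 1 if found, 0 otherwise. */
int region_of(const void *addr, char *name, size_t name_len)
{
    uintptr_t target = (uintptr_t)addr;
    char line[4224];   /* room for a long path plus the fixed fields */
    int found = 0;
    FILE *maps;

    if (name_len == 0)
        return 0;
    maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return 0;

    while (!found && fgets(line, sizeof line, maps) != NULL) {
        uintptr_t start, end;

        /* Every entry begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR, &start, &end) != 2)
            continue;
        if (target < start || target >= end)
            continue;

        /* The label, if any, starts at the first '/' or '[' on the line. */
        line[strcspn(line, "\n")] = '\0';
        const char *label = strpbrk(line, "/[");
        snprintf(name, name_len, "%s", label != NULL ? label : "");
        found = 1;
    }
    fclose(maps);
    return found;
}

/* Prints the region of a local variable and of a block from malloc. */
void show_regions(void)
{
    char label[256];
    int local = 0;
    char *block = malloc(80);

    if (region_of(&local, label, sizeof label))
        printf("&local -> %s\n", label);
    if (block != NULL && region_of(block, label, sizeof label))
        printf("block  -> %s\n", label[0] != '\0' ? label : "(anonymous mapping)");
    free(block);
}
```

An empty label for the malloc block would indicate that the allocator served the request from an anonymous mapping rather than from the region the kernel labels [heap].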
In essence, you could think of malloc(80) as behaving something like this (not taking free() into consideration, so this is an oversimplification):
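    #include <stddef.h>
    #include <sys/mman.h>
    #define MALLOC_CHUNK_LENGTH (1 << 20)   /* assumed chunk size: 1 MiB */
    static size_t space_left = 0;
    static char *last_mapping = NULL;
    /* Stand-in for malloc(), renamed so it does not clash with the real one.
       No free(), no alignment, and requests larger than one chunk fail. */
    void *my_malloc(size_t req) {
        void *result;
        if (req > MALLOC_CHUNK_LENGTH)
            return NULL;
        if (space_left < req) {
            /* Ask the kernel for a fresh anonymous (not file-backed) region. */
            void *chunk = mmap(NULL, MALLOC_CHUNK_LENGTH, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (chunk == MAP_FAILED)
                return NULL;
            last_mapping = chunk;
            space_left = MALLOC_CHUNK_LENGTH;
        }
        space_left -= req;
        result = last_mapping;
        last_mapping += req;
        return result;
    }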
The huge difference between calling malloc and calling mmap with MAP_PRIVATE | MAP_ANONYMOUS directly is that mmap is a Linux system call: every call has to switch into the kernel, set up a new memory mapping and update the process's memory mappings for the chunk being allocated. malloc can be more intelligent: it can use a single big region as the "heap" and manage the individual malloc and free calls in user space after the heap has been initialized (until the heap runs out of space, at which point it might have to manage multiple heaps).
Regarding the last part of your question, "since pointer1 = malloc(80), does it belong to the stack section?", I can tell you the following.
In C, dynamic memory is allocated from the heap using some standard library functions. The two key dynamic memory functions are malloc() and free().
The malloc() function takes a single parameter, which is the size of the requested memory area in bytes. It returns a pointer to the allocated memory. If the allocation fails, it returns NULL. The prototype for the standard library function is like this:
void *malloc(size_t size);
The free() function takes the pointer returned by malloc() and de-allocates the memory. No indication of success or failure is returned. The function prototype is like this:
void free(void *pointer);
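A minimal sketch of how the two functions are usually paired is shown here. The helper name copy_to_heap is hypothetical and only serves to show the NULL check after malloc() and the caller's responsibility to call free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of text, or NULL if
 * malloc() fails. The caller owns the result and must free() it. */
char *copy_to_heap(const char *text)
{
    size_t size = strlen(text) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy == NULL)                 /* allocation failed */
        return NULL;
    memcpy(copy, text, size);
    return copy;
}
```

A caller would typically write char *p = copy_to_heap("Testing.."); then check p against NULL, use the string, and finally call free(p) exactly once.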
You can refer to this article:
https://www.design-reuse.com/articles/25090/dynamic-memory-allocation-fragmentation-c.html