Is the memory chunk returned by malloc (and its co

Posted 2020-04-06 16:04

Question:

I wrote some code to stress test the memory management of Linux and Windows. As a further test, I checked what values are present in the memory returned by malloc().

The values being returned are all 0 (zero). I have read the man page of malloc and checked on both Windows and Linux, but I cannot find the reason for this behavior. According to the man page:

The malloc() function allocates size bytes and returns a pointer to the allocated memory. The memory is not initialized.

To clear the memory segment, one has to manually use memset().

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <stdbool.h>

int eat(long total,int chunk){
    long i;
    for(i=0;i<total;i+=chunk){
        short *buffer=malloc(sizeof(char)*chunk);     
        if(buffer==NULL){
            return -1;
        }
        printf("\nDATA=%d",*buffer);
        memset(buffer,0,chunk);
    }
    return 0;
}

int main(int argc, char *argv[]){
    int chunk=1024;
    long size=10000;
    printf("Got %ld bytes in chunks of %d...\n",size,chunk);
    if(eat(size,chunk)==0){
        printf("Done, press any key to free the memory\n");
        getchar();
    }else{
        printf("ERROR: Could not allocate the memory");
    }
    return 0;
}

Maybe I am missing something. The code is adapted from here

EDIT: The problem has been answered here for the GCC-specific output. I believe the Windows operating system follows the same procedure.

Answer 1:

The memory returned by malloc() is not initialized, which means it may contain anything, zero included. To get memory that is guaranteed to be zeroed, use calloc().

The reason you are seeing zeroed pages (on Linux anyway) is that if an application requests new pages, these pages are zeroed by the OS (or more precisely they are copy-on-write images of a fixed page of zeroes known as the 'global zero page'). But if malloc() happens to use memory already allocated to the application which has since been freed (rather than expanding the heap) you may well see non-zero data. Note the zeroing of pages provided by the OS is an OS specific trait (primarily there for security so that one process doesn't end up with pages that happen to have data from another process), and is not mandated by the C standard.

You asked for a source for get_free_page zeroing the page: that says 'get_free_page() takes one parameter, a priority. ... It takes a page off of the free_page_list, updates mem_map, zeroes the page and returns the physical address of the page.' Here's another post that explains it well, and also explains why using calloc() is better than malloc()+memset().

Note that you aren't checking the entire allocated chunk for zero. You want something like this (untested):

int n;
char nonzero=0;
char *buffer=malloc(sizeof(char)*chunk);     
if(buffer==NULL){
    return -1;
}
for (n = 0; n<chunk; n++)
    nonzero = nonzero || buffer[n];  
printf("\nDATA=%s\n",nonzero?"nonzero":"zero");
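The fragment above can be turned into a pair of small helpers. This is only a sketch: the names has_nonzero_byte and probe_malloc are made up for illustration and do not come from the question. It reads the block through unsigned char, and tools such as Valgrind or MemorySanitizer will report the read of uninitialized memory, which is the point of the experiment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if any of the first len bytes at buf
 * is non-zero, 0 otherwise. */
int has_nonzero_byte(const unsigned char *buf, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        if (buf[n] != 0) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: allocates chunk bytes with malloc() and inspects
 * every byte of the block, not just the first two.
 * Returns 1 (non-zero data seen), 0 (all zero) or -1 (allocation failed).
 * The result for a malloc'ed block is not predictable. */
int probe_malloc(size_t chunk)
{
    unsigned char *buffer = malloc(chunk);
    if (buffer == NULL) {
        return -1;
    }
    int result = has_nonzero_byte(buffer, chunk);
    free(buffer);
    return result;
}
```

Calling probe_malloc(1024) in a loop from eat() would report "zero" or "nonzero" per chunk; either answer is legal for malloc().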


Answer 2:

You're absolutely correct; this behaviour is not guaranteed by the C language standard.

What you're observing could just be chance (you're only checking a couple of bytes in each allocation), or it could be an artifact of how your OS and C runtime library are allocating memory.



Answer 3:

With this statement:

printf("\nDATA=%d",*buffer);

You only check the first sizeof(short) bytes of the block that was just malloc()'ed (typically two bytes).

Furthermore, you may get lucky and see all zeroes the first time, but once your program has run for a while and reused heap memory, the contents of a freshly malloc()'ed block are indeterminate.



Answer 4:

The memory allocation function calloc() returns a pointer to the new area and sets all of its bytes to zero.

The memory allocation function realloc() returns a pointer to a (possibly new) area after copying the bytes from the old area. The new area has the newly requested length; if it is larger than the old one, the additional bytes are not initialized.

The memory allocation function malloc() returns a pointer to the new area but does not set the bytes to any specific value.
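The two guarantees described above (calloc() zeroes, realloc() preserves the old bytes) can be checked with a short sketch. The function names calloc_is_zeroed and realloc_keeps_prefix are hypothetical, and the 0xAB fill pattern is an arbitrary choice.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if calloc() hands back n zero bytes,
 * 0 if a non-zero byte is found, -1 on bad input or allocation failure. */
int calloc_is_zeroed(size_t n)
{
    if (n == 0) {
        return -1;
    }
    unsigned char *p = calloc(n, 1);
    if (p == NULL) {
        return -1;
    }
    int ok = 1;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            ok = 0;
            break;
        }
    }
    free(p);
    return ok;
}

/* Hypothetical helper: returns 1 if realloc() kept the bytes common to
 * the old and new sizes, 0 if not, -1 on bad input or allocation failure. */
int realloc_keeps_prefix(size_t old_n, size_t new_n)
{
    if (old_n == 0 || new_n == 0) {
        return -1;              /* avoid the implementation-defined size-0 case */
    }
    unsigned char *p = malloc(old_n);
    if (p == NULL) {
        return -1;
    }
    memset(p, 0xAB, old_n);     /* recognisable pattern in the old block */

    unsigned char *q = realloc(p, new_n);
    if (q == NULL) {
        free(p);                /* on failure the old block is still valid */
        return -1;
    }
    size_t keep = old_n < new_n ? old_n : new_n;
    int ok = 1;
    for (size_t i = 0; i < keep; i++) {
        if (q[i] != 0xAB) {
            ok = 0;
            break;
        }
    }
    free(q);
    return ok;
}
```

Only the first min(old_n, new_n) bytes are compared, because anything realloc() adds beyond the old size is as uninitialized as a fresh malloc() block.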



Answer 5:

The values that are being returned are all 0 (zero).

But that's not guaranteed. You see zeroes because your program has only just started running. If you malloc, fill with random data, and free many times, you'll start noticing that previously freed memory is being reused, so you'll start to get non-zero chunks from malloc.
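A rough way to try this experiment is sketched below. The function name count_nonzero_after_reuse and the 0xFF fill value are assumptions made for illustration. Whether the second malloc() actually returns the recycled block depends on the allocator, so the count may legitimately be anything from 0 up to size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: fills a block, frees it, allocates a block of the
 * same size again and counts how many of its bytes are non-zero.
 * Returns that count, or -1 on bad input or allocation failure. */
long count_nonzero_after_reuse(size_t size)
{
    if (size == 0) {
        return -1;
    }
    unsigned char *first = malloc(size);
    if (first == NULL) {
        return -1;
    }

    /* Write through a volatile pointer so the compiler cannot discard the
     * stores as dead writes to memory that is about to be freed. */
    volatile unsigned char *fill = first;
    for (size_t i = 0; i < size; i++) {
        fill[i] = 0xFF;
    }
    free(first);

    /* The allocator may or may not hand the same memory back. */
    unsigned char *second = malloc(size);
    if (second == NULL) {
        return -1;
    }
    long nonzero = 0;
    for (size_t i = 0; i < size; i++) {
        if (second[i] != 0) {
            nonzero++;
        }
    }
    free(second);
    return nonzero;
}
```

On many allocators a call such as count_nonzero_after_reuse(1024) returns a value well above zero because the freed chunk is reused, but no particular value is guaranteed.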



Answer 6:

Yes, you are right: malloc() doesn't zero-initialize the memory. It simply takes the requested amount of memory from the heap, which means anything could already be stored in it. You should therefore use malloc() only where you are certain that you will set the memory to a value before reading it. If you do arithmetic with it right out of the box, you may get fishy results (I have personally experienced this several times; you end up with working code that sometimes produces crazy output).

So use memset() to zero anything you are not going to set yourself. Or, my advice, use calloc(). Unlike malloc(), calloc() does zero-initialize the memory. As far as I know it is also faster than the combination of malloc() and memset(); on the other hand, malloc() alone is faster than calloc(). So pick the fastest variant that still leaves your memory in a defined state at the point where you need it.

Look also at this post here: MPI matrix-vector-multiplication returns sometimes correct sometimes weird values. The question was a different one, but the cause the same.