Do I really need malloc?

I understand that malloc is used to dynamically allocate memory. In my code, I have the following function that I sometimes call:

int memory_get_log(unsigned char day, unsigned char date, unsigned char month){

    char fileName[11];
    unsigned long readItems, itemsToRead;
    F_FILE *file;

    sprintf(fileName, "%s_%u%u%u%s", "LOG", day, date, month, ".bin");

    file = f_open(fileName , "r");

    itemsToRead = f_filelength( fileName );

    //unsigned char *fileData = (unsigned char *) malloc(itemsToRead);
    unsigned char fileData[itemsToRead]; //here I am not using malloc

    readItems = f_read(fileData, 1, itemsToRead, file);

    transmit_data(fileData, itemsToRead);

    f_close(file);

    return 0;
}

As you may see, the number of items I read from the file can be different each time. The line unsigned char fileData[itemsToRead]; is used to read these variable sized files. I can see that I am allocating memory dynamically in some way. This function works fine. Do I really need to use malloc here? Is there anything wrong with the way I am declaring this array?

回答1:

Do I really need to use malloc here? Is there anything wrong with the way I am declaring this array?

It depends. VLA:s was removed as a mandatory component from C11, so strictly speaking, you are using compiler extensions, thus reducing the portability. In the future, VLA:s might (The chance is probably extremely low) get removed from your compiler. Maybe you also want to recompile the code on a compiler without support for VLA:s. The risk analysis about this is up to you.

Another problem is if the allocation fails. If you're using malloc, you have a chance to recover from this, but if you're only going to do something like this:

unsigned char *fileData = malloc(itemsToRead);
if(!fileData)
    exit(EXIT_FAILURE);

That is, just exit on failure and not trying to recover, then it does not really matter. At least not from a pure recovery point of view.

But also, although the C standard does not impose any requirement that VLA:s end up on the stack or heap, as far as I know it's pretty common to put them on the stack. This means that the risk of failing the allocation due to insufficient available memory is much, much higher. On Linux, the stack is usually 8MB and on Windows 1MB. In almost all cases, the available heap is much higher. The declaration char arr[n] is basically the same as char *arr = alloca(n) with the exception of how the sizeof operator works.

While I can understand that you might want to use the sizeof operator on a VLA sometimes, I find it very hard to find a real need for it. Afterall, the size can never change, and the size is known when you do the allocation. So instead of:

int arr[n];
...
for(int i=0; i<sizeof(arr), ...

Just do:

const size_t size = n;
int arr[size];
...
for(int i=0; i<size; ...

VLA:s are not a replacement for malloc. They are a replacement for alloca. If you don't want to change a malloc to an alloca, then you should not change to a VLA either.

Also, in many situations where a VLA would seem to bee a good idea, it is ALSO a good idea to check if the size is below a certain limit, like this:

int foo(size_t n)
{
    if(n > LIMIT) { /* Handle error */ }
    int arr[n];
    /* Code */
}

That would work, but compare it to this:

int foo(size_t n)
{
    int *arr = malloc(n*sizeof(*arr));
    if(!arr) { /* Handle error */ }
    /* Code */
    free(arr);
}

You did not really make things that much easier. It's still an error check, so the only thing you really got rid of was the free call. I might also add that it's a MUCH higher risk that a VLA allocation fails due to the size being too big. So if you KNOW that the size is small, the check is not necessary, but then again, if you KNOW that it is small, just use a regular array that will fit what you need.

However, I will not deny that there are some advantages of VLA:s. You can read about them here. But IMO, while they have those advantages they are not worth it. Whenever you find VLA:s useful, I would say that you should at least consider switching to another language.

There are many things to consider, but I would avoid using VLA:s. If you ask me, the biggest risk with them is that since they are so easy to use, people become careless with them. For those few cases where I find them suitable, I would use alloca instead, because then I don't hide the dangers.

Short summary

VLA:s are not required by C11 and later, so strictly speaking, you're relying on compiler extensions.
VLA:s are syntactic sugar (Not 100% correct, especially when dealing with multidimensional arrays) for alloca and not malloc. So don't use them instead of malloc. With the exception of how sizeof work on a VLA, they offer absolutely no benefit at all except for a somewhat simpler declaration.
VLA:s are (usually) stored on the stack while allocations done by malloc are (usually) stored on the heap, so a big allocation has a much higher risk to fail.
You cannot check if a VLA allocation failed or not, so it can be a good idea to check if the size is too big in advance. But then we have an error check just as we do with checking if malloc returned NULL.

This function works fine.

No it does not. It has undefined behavior. As pointed out by Jonathan Leffler in comments, the array fileName is too short. It would need to be at least 12 bytes to include the \0-terminator. You can make this a bit safer by changing to:

snprintf(fileName, 
         sizeof(fileName), 
         "%s_%u%u%u%s", 
         "LOG", day, date, month, ".bin");

In this case, the problem with the too small array would manifest itself by creating a file with extension .bi instead of .bin which is a better bug than undefined behavior, which is the current case.

You also have no error checks in your code. I would rewrite it like this. And for those who thinks that goto is bad, well, it usually is, but error handling is both practical and universally accepted among experienced C coders. Another common use is breaking out of nested loops, but that's not applicable here.

int memory_get_log(unsigned char day, unsigned char date, unsigned char month){

    char fileName[12];
    unsigned long readItems, itemsToRead;
    int ret = 0;

    F_FILE *file;

    snprintf(fileName, 
             sizeof(fileName), 
             "%s_%u%u%u%s", "LOG", 
             day, date, month, ".bin");

    file = f_open(fileName , "r");
    if(!file) { 
        ret = 1; 
        goto END;
    }

    itemsToRead = f_filelength( fileName );

    unsigned char *fileData = malloc(itemsToRead);
    if(!fileData) { 
        ret=2;
        goto CLOSE_FILE;
    }

    readItems = f_read(fileData, 1, itemsToRead, file);
    // Maybe not necessary. I don't know. It's up to you.
    if(readItems != itemsToRead) {  
        ret=3;
        goto FREE;
    }

    // Assuming transmit_data have some kind of error check
    if(!transmit_data(fileData, itemsToRead)) {  
        ret=4;
    }

FREE:
    free(fileData);
CLOSE_FILE:
    f_close(file);
END:
    return ret;
}

If a function only returns 0, then it's pointless to return anything. Declare it as void instead. Now I used the return value to make it possible for the caller to detect errors and the type of error.

回答2:

firstly, the line 'unsigned char fileData[itemsToRead]' asks for memory on stack, and this will be terrible mistake if file size is big. You should use 'malloc' to ask memory on heap. secondly, if file size is really big enough, you should coside to use virsual memory or dynamic load such as 'fseek' method.