malloc() with a too-small buffer

Posted 2019-07-12 02:30

Question:

I am asking this question to polish my knowledge. I wrote some test code, copied below.

ptr1 is allocated with malloc(1). I then append three strings, 30 characters in total. The output comes out fine. I looked through the code of the strncat() function; it does not seem to allocate extra memory for the copy either. I am also not adding a null character at the end of the ptr1 buffer.

How can this produce the right result? Is this code correct?

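#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void joinWithMe(char *me, const char *him);

int main(void)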
{
    char *name = "First Set";
    char *ptr1;

    ptr1 = malloc(1); // <<== here the memory is allocated only 1 byte
    joinWithMe(ptr1, name); // <<== 9 bytes are copied
    printf("PTR1 : %s\n", ptr1);  

    joinWithMe(ptr1, "-Second Set"); // <<== 11 bytes are copied
    printf("PTR1 : %s\n", ptr1);

    joinWithMe(ptr1, "-Third Set");  // <<== 10 bytes are copied
    printf("PTR1 : %s\n", ptr1);

    return 0;
}

void joinWithMe(char *me, const char *him)
{
    strncat(me, him, strlen(him));
}


OUTPUT

$ ./ctest
PTR1 : First Set
PTR1 : First Set-Second Set
PTR1 : First Set-Second Set-Third Set

Answer 1:

It produces the "right" result because you got lucky. malloc() does not make memory accessible; it finds memory and grants you permission to use it, with a promise that no other code has that permission. The ability to access nearby memory (if one knows the address) exists regardless. If you write past the end of the buffer, that is, without permission, the behavior is undefined.

In this undefined case, if the memory beyond the buffer is not writable, you'll probably crash. If it is writable but no other code is using it (which is likely in your case), the program will appear to work without any issue. If some other code is actively using that memory, though, either that code will disrupt yours, or your code will disrupt the other, or both. The outcome of that disruption is unpredictable, since it depends on what the code is doing; that is what undefined means here.
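For comparison, here is a minimal sketch of an append that stays inside the memory it was granted, by growing the buffer with realloc before each copy. The helper name append_grow and its return convention are made up for illustration; they are not part of the asker's code or of the C library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name and return convention are assumptions).
   Appends src to the heap string *dst, growing the allocation first.
   *dst may be NULL, which is treated as an empty string.
   Returns 0 on success, -1 if realloc fails (*dst is then unchanged). */
int append_grow(char **dst, const char *src)
{
    size_t old_len = (*dst != NULL) ? strlen(*dst) : 0;
    size_t add_len = strlen(src);

    /* Room for the old text, the new text, and the terminating '\0'. */
    char *grown = realloc(*dst, old_len + add_len + 1);
    if (grown == NULL)
        return -1;

    /* Copy src including its terminator, right after the old text. */
    memcpy(grown + old_len, src, add_len + 1);
    *dst = grown;
    return 0;
}
```

Starting from a NULL pointer and calling append_grow three times with "First Set", "-Second Set" and "-Third Set" builds the same 30-character string as the question, but every byte written lies inside the current allocation. The caller is responsible for calling free() on the result.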



Answer 2:

There are two things nobody has mentioned to you with regard to strncat. (I think others have done a good job explaining the buffer overrun issues.)

1.

Both string arguments to strncat (in fact, all string arguments to any str*** function) are assumed to be zero-terminated strings. In your case the byte allocated by malloc apparently happens to be 0, so it is interpreted as an empty string. If the uninitialized memory started with some value other than 0, it would be interpreted as a string that runs as far as needed to find a 0 byte. So in general, the buffer returned from malloc could look like a very long string of garbage, much longer than the amount of memory you allocated. If you want memory initialized to 0, use calloc.
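As a rough illustration of this point, the sketch below shows two ways to obtain a buffer that strncat can safely treat as an empty string. Both function names are hypothetical and exist only for this example.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate with malloc, then terminate explicitly.
   Only the first byte is initialized; the rest is still indeterminate. */
char *new_empty_string(size_t capacity)
{
    if (capacity == 0)
        return NULL;            /* no room even for the '\0' */

    char *buf = malloc(capacity);
    if (buf != NULL)
        buf[0] = '\0';          /* now a valid empty C string */
    return buf;
}

/* Hypothetical helper: let calloc zero every byte of the buffer. */
char *new_zeroed_string(size_t capacity)
{
    if (capacity == 0)
        return NULL;
    return calloc(capacity, 1); /* all bytes are 0, or NULL on failure */
}
```

Either result can be passed as the destination of strncat, provided the count argument still respects the capacity. Writing only the first byte is enough for correctness; calloc costs a little more but leaves no indeterminate bytes behind.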

2.

The last parameter to strncat is meant as a way to prevent buffer overruns: it is the maximum number of characters to append, so pass the space left in the buffer you have already allocated, minus one for the terminating null that strncat always adds.

// allocate a string buffer
const size_t bufSz = 20;
char *pBuf = malloc(bufSz);

// initialize it with either a short or long string
bool shortOrLong = ...
strcpy(pBuf, shortOrLong ? "short_str" : "longer_string");

// append a string without buffer overrun
const char *pStr = "_hi_there";
strncat(pBuf, pStr, bufSz-strlen(pBuf)-1);

Depending on the value of shortOrLong, the contents of pBuf will be one of the following.

|0| | | | |5| | | | |10 | | | |15 | | | |20  <-- index
+=========+=========+=========+=========+
|                                       |    <-- before strcpy
+=========+=========+=========+=========+
|s|h|o|r|t|_|s|t|r|0|                   |    <-- before strncat
|s|h|o|r|t|_|s|t|r|_|h|i|_|t|h|e|r|e|0| |    <-- after strncat
+=========+=========+=========+=========+
|l|o|n|g|e|r|_|s|t|r|i|n|g|0            |    <-- before strncat
|l|o|n|g|e|r|_|s|t|r|i|n|g|_|h|i|_|t|h|0|    <-- after strncat
+=========+=========+=========+=========+

(I used 0 to indicate the null termination character, and blank (space) to indicate uninitialized values.)

Note that in the "short_str" case the appended string fit into the buffer, so all of it was copied; in the "longer_string" case only part of the string fit, so only that part was copied, and the result was still null terminated, keeping it a valid C string without overrunning the buffer.
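One way to package the pattern above is a small wrapper that also tells the caller whether the text was cut short. This is only a sketch: append_bounded is a hypothetical name, and it assumes buf already holds a null-terminated string that fits within buf_size bytes.

```c
#include <string.h>

/* Hypothetical helper (not a standard function).
   Precondition: buf holds a null-terminated string shorter than buf_size.
   Appends as much of src as fits, always leaving buf null terminated.
   Returns 1 if all of src was appended, 0 if it was truncated. */
int append_bounded(char *buf, size_t buf_size, const char *src)
{
    size_t used = strlen(buf);
    size_t room = buf_size - used - 1;  /* keep one byte for the '\0' */

    strncat(buf, src, room);            /* copies at most room chars, then adds '\0' */
    return strlen(src) <= room;
}
```

With a 20-byte buffer holding "short_str", appending "_hi_there" returns 1 and leaves "short_str_hi_there". With "longer_string" in the same buffer, it returns 0 and leaves "longer_string_hi_th", matching the two rows of the diagram.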