I observe an interesting problem with the Microsoft implementation of strncat. It touches 1 byte beyond the source buffer. Consider the following code:
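    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char dstBuf[1024];
        /* Deliberately NOT NUL-terminated: all 112 bytes are 'a'. */
        char* src = malloc(112);
        if (src == NULL)
            return 1;
        memset(src, 'a', 112);
        dstBuf[0] = 0;
        /* At most 112 characters may be appended, so src[112] should not be needed. */
        strncat(dstBuf, src, 112);
        free(src);
        return 0;
    }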
strncat reads 1 byte after the 112-byte block. So if you are unlucky enough to get an allocation that ends right at the boundary of an invalid page, your application crashes. Large applications can crash intermittently in such places. (Note that this condition can be simulated with the gflags PageHeap setting; the block size has to be divisible by the pointer size for proper alignment.)
Is this the expected behavior or a bug? Any links confirming that? (I read several descriptions of strncat, but they can be interpreted both ways depending on your initial mindset...)
Update (to answer questions about evidence):
I apologize if it is not clear from the text above, but this is an experimental fact. I observe intermittent crashes in an application at strncat reading address src+srcBufSize. In this small example, run with gflags PageHeap on, the crash reproduces consistently (100%). So as far as I can see, the evidence is very solid.
Update 2 (info on compiler): MS Visual Studio 2005 Version 8.0.50727.867. Build platform: 64 bit release (no repro for 32 bit). OS used to repro the crash: Windows Server 2008 R2.
Update 3: The problem also reproduces with a binary built in MS Visual Studio 2012 11.0.50727.1.
Update 4: Link to issue on Microsoft Connect; link to discussion on MSDN Forums.
Update 5: The problem will be fixed in the next VS release. No fix is planned for old versions. See the "Microsoft Connect" link above.
The documentation for strncat states:
Therefore, the implementation can assume that the src input parameter is in fact NUL-terminated, even if it is longer than count characters.
For further confirmation, Microsoft's own documentation states:
On the other hand, the actual C standard states something like:
As pointed out in the comments below, this identifies the second parameter s2 as an array and not a NUL-terminated string. However, this is still ambiguous with respect to the original question, because this documentation describes the ultimate effect on s1, rather than the behaviour of the function when reading from s2.
This could of course be settled with respect to the specific Microsoft implementation by consulting the C Runtime Library source code.
English is an imperfect language, more so than C.
The documentation says "at most n characters" (my emphasis). There is no evidence to indicate that strncat copies more than 112 characters. What makes you believe it does?
The code of strncat might index past an offset of 112 without actually referencing offset 113, and only an actual reference could cause a storage fault. This pointer behavior is defined as acceptable in K&R.
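The difference between forming a pointer just past the end of a block and actually reading through it can be sketched as follows. This is only an illustration with a hypothetical function name, not code taken from any C runtime.

```c
#include <stddef.h>

/* Hypothetical helper: counts occurrences of c in the first n bytes of p. */
size_t count_byte(const char *p, size_t n, char c)
{
    const char *end = p + n;  /* one past the last byte: legal to form and compare */
    size_t hits = 0;

    for (; p != end; ++p) {   /* the loop stops before p reaches end... */
        if (*p == c)          /* ...so *end itself is never read */
            ++hits;
    }
    return hits;
}
```

Here `end` points one past the block, yet no byte outside the first `n` is ever loaded, so a guard page placed right after the block would not be touched.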
Finally, again, this is an English/reasoning problem: the documentation probably does say "null-terminated string". But really, isn't it redundant to say a string is null-terminated? Strings are by definition; otherwise they would be arrays of characters. So the documentation is being vague and non-specific, and the programmer is left to read between the lines. Software documentation is not a legal tome; it is a description meant to be understood by someone practiced in the art.
s2 is not a "string" in strncat(s1, s2, n).
So if Microsoft is reading past n bytes, it is not C11 compliant.
C11 7.24.3.1 strcat() mentions "appends a copy of the string pointed to by s2 (including the terminating null character) to the end of the string pointed to by s1".
C11 7.24.3.2 strncat says "The strncat function appends not more than n characters (a null character and characters that follow it are not appended) from the array pointed to by s2 to the end of the string pointed to by s1. ... A terminating null character is always appended to the result"
Clearly, in the strncat case, s2 is viewed as an "array" with string-like limitations on how much is appended to s1. Thus, during the concatenation, there is no need to inspect s2 more than what is absolutely needed. The final written \0 comes from the code, not s2.
Don't know about the older C99 standard.
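To make this reading concrete, here is a minimal sketch of a strncat-like function that never examines more than n bytes of s2 and writes the terminator itself. The name `strncat_bounded` is hypothetical, and this is not Microsoft's implementation; it only shows that the behaviour described by the standard is achievable without touching s2[n].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name: appends at most n characters from the array s2 to the
   string s1, reading no more than n bytes of s2. The caller must ensure
   s1 has room for strlen(s1) + n + 1 bytes. */
char *strncat_bounded(char *s1, const char *s2, size_t n)
{
    char *end;
    const char *nul;
    size_t len;

    assert(s1 != NULL && s2 != NULL);

    end = s1 + strlen(s1);               /* s1 must be a real string */
    nul = memchr(s2, '\0', n);           /* examines at most n bytes of s2 */
    len = nul ? (size_t)(nul - s2) : n;  /* stop at an early NUL, else take n */

    memcpy(end, s2, len);
    end[len] = '\0';                     /* the terminator comes from the code, not s2 */
    return s1;
}
```

A function along these lines could also stand in for strncat on toolchains where the library version reads past the source block, although that substitution is a suggestion here rather than something taken from the discussion above.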