This is inspired by this question, and in particular the comments on one answer there, through which I learnt that `strncpy` is not a very safe string-handling function in C, and that it pads the destination with zeros until it reaches n, something I was unaware of.
Specifically, to quote R..:

> `strncpy` does not null-terminate, and does null-pad the whole remainder of the destination buffer, which is a huge waste of time. You can work around the former by adding your own null padding, but not the latter. It was never intended for use as a "safe string handling" function, but for working with fixed-size fields in Unix directory tables and database files. `snprintf(dest, n, "%s", src)` is the only correct "safe `strcpy`" in standard C, but it's likely to be a lot slower. By the way, truncation in itself can be a major bug and in some cases might lead to privilege elevation or DoS, so throwing "safe" string functions that truncate their output at a problem is not a way to make it "safe" or "secure". Instead, you should ensure that the destination buffer is the right size and simply use `strcpy` (or better yet, `memcpy` if you already know the source string length).
And from Jonathan Leffler:

> Note that `strncat()` is even more confusing in its interface than `strncpy()` - what exactly is that length argument, again? It isn't what you'd expect based on what you supply `strncpy()` etc - so it is more error prone even than `strncpy()`. For copying strings around, I'm increasingly of the opinion that there is a strong argument that you only need `memmove()` because you always know all the sizes ahead of time and make sure there's enough space ahead of time. Use `memmove()` in preference to any of `strcpy()`, `strcat()`, `strncpy()`, `strncat()`, `memcpy()`.
So, I'm clearly a little rusty on the C standard library. Therefore, I'd like to pose the question:
What C standard library functions are used inappropriately, in ways that may lead to security problems, code defects, or inefficiencies?
In the interests of objectivity, I have a number of criteria for an answer:
- Please, if you can, cite the design reasons behind the function in question, i.e. its intended purpose.
- Please highlight the misuse to which the code is currently put.
- Please state why that misuse may lead to a problem. I know that should be obvious, but it prevents soft answers.
Please avoid:
- Debates over naming conventions of functions (except where this unequivocally causes confusion).
- "I prefer x over y" - preference is OK; we all have them, but I'm interested in actual unexpected side effects and how to guard against them.
As this is likely to be considered subjective and has no definite answer, I'm flagging it for community wiki straight away.
I am also working as per C99.
How about the `malloc` family in general? The vast majority of large, long-lived programs I've seen use dynamic memory allocation all over the place as if it were free. Of course real-time developers know this is a myth, and careless use of dynamic allocation can lead to catastrophic blow-up of memory usage and/or fragmentation of address space to the point of memory exhaustion.

In some higher-level languages without machine-level pointers, dynamic allocation is not so bad, because the implementation can move objects and defragment memory during the program's lifetime, as long as it can keep references to those objects up to date. A non-conventional C implementation could do this too, but working out the details is non-trivial, and it would incur a very significant cost on every pointer dereference and make pointers rather large, so for practical purposes it's not possible in C.
My suspicion is that the correct solution is usually for long-lived programs to perform their small routine allocations as usual with `malloc`, but to keep large, long-lived data structures in a form where they can be reconstructed and replaced periodically to fight fragmentation, or as large `malloc` blocks containing a number of structures that make up a single large unit of data in the application (like a whole web page presentation in a browser), or on disk with a fixed-size in-memory cache or memory-mapped files.

There's already one answer about `realloc`, but I have a different take on it. A lot of the time, I've seen people write `realloc` when they mean `free`; `malloc` - in other words, when they have a buffer full of trash that needs to change size before storing new data. This of course leads to a potentially large, cache-thrashing `memcpy` of trash that's about to be overwritten.

If used correctly with growing data (in a way that avoids worst-case O(n^2) performance for growing an object to size n, i.e. growing the buffer geometrically instead of linearly when you run out of space), `realloc` has doubtful benefit over simply doing your own new `malloc`, `memcpy`, and `free` cycle. The only way `realloc` can ever avoid doing this internally is when you're working with a single object at the top of the heap.

If you like to zero-fill new objects with `calloc`, it's easy to forget that `realloc` won't zero-fill the new part.

And finally, one more common use of `realloc` is to allocate more than you need, then resize the allocated object down to just the required size. But this can actually be harmful (additional allocation and `memcpy`) on implementations that strictly segregate chunks by size, and in other cases might increase fragmentation (by splitting off part of a large free chunk to store a new small object, instead of using an existing small free chunk).

I'm not sure if I'd say `realloc` encourages bad practice, but it's a function I'd watch out for.