C snprintf to append struct member char*

2019-08-20 16:48发布

问题:

When I print out result->value I get garbage (text from other memory areas), and a free(): invalid pointer

I am trying to safely append a string to an existing char* (the value member of the struct of which result is an instance)

unsigned const NAME_BUFFER=100;
unsigned const VALUE_BUFFER=1000;

typedef struct {
    char *name;
    int  ID;
    char *value;
} props;

…
static Bool
myfunc(props *result) 
{
    unsigned char *pointer;

    result->name=malloc(NAME_BUFFER);
    result->value=malloc(VALUE_BUFFER);

// not sure I need to do this, previous instances of the code produced
// output such as the quick...(null)someoutput
// which I thought might be because the member is empty the first time?
    sprintf(result->name,"%s","\0");
    sprintf(result->value,"%s","\0");

    …

    // in a loop which sets pointer, we want to safely append the value of pointer to the
    // value of the member called value

    snprintf(result->value,VALUE_BUFFER,"the quick...%s%s",result->value,pointer);

    …

    return False;
}

static void 
my_print_func() 
{
    props *result=malloc(sizeof(props));

    if(my_func(result))
        printf("%d\t%s",result->ID,result->value);
}

Changing the above to use sprintf doesn't cause these problems:

sprintf(result->value,"the quick...%s%s",result->value,pointer);

... other than the fact that it would happily try to insert more characters than allocated.

So what is the correct way of appending with sprintf (or variant) whilst also making sure we don't go out of bounds?

Ideally, it wouldn't involve temporary variables other constructs that need more than one line as I need to repeat this append in several places.

回答1:

This is undefined behaviour because the input arguments cannot be part of the output buffer (both in the case of snprintf and sprintf):

snprintf(result->value,VALUE_BUFFER,"the quick...%s%s",result->value,pointer);

This is specified telegraphically in the C standard:

§7.21.6.5/para 2: …If copying takes place between objects that overlap, the behavior is undefined. (the same sentence appears in §7.21.6.6/2 with respect to sprintf)

and also in man sprintf:

…the results are undefined if a call to sprintf(), snprintf(), vsprintf(), or vsnprintf() would cause copying to take place between objects that overlap (e.g., if the target string array and one of the supplied input arguments refer to the same buffer). (Linux version)

If it happens to produce the expected result in some circumstances, you were lucky (or unlucky, since the fluke could lead you to an invalid conclusion).

Although it is not incorrect, this line is ridiculously complicated for what it does:

sprintf(result->name,"%s","\0");

"\0" is treated as a zero-length string because strings are terminated by the first NUL character, so it only differs from "" by the fact that it uses up two bytes instead of one. But in any case, you could simply write:

result->name[0] = 0; /* Or ... `\0` if you like typing */

The standard library includes strcat and strncat for concatenating strings, but the "safe" version strncat only lets you specify a limit to the number of characters to append, not a limit to the total length of the string. So you need to keep track of the number of characters available yourself, and if you're going to do that, you might as well instead keep track of the position of the end of the string, which is where you want to copy the appended string, rather than searching for the end every time you do a concatenation. For this reason, str(n)cat are hardly ever the correct solution for string concatenation.

Here's a simple outline for concatenating multiple chunks to an output buffer:

size_t used = 0;
result->value = malloc(MAX_VALUE_LEN + 1);
for (...) { /* loop which produces the strings to append */
  ...
  /* append a chunk */
  size_t chunk_len = strlen(chunk);
  if (MAX_VALUE_LEN - used >= chunk_len) {
    memcpy(result->value + used, chunk, chunk_len);
    used += chunk_len;
  }
  else {
    /* Value is too long; return an error */
  }
}
result->value[used] = 0;

Not everyone will agree with my use of memcpy rather than strcpy; I did it because I already knew the length of the string to copy (which I had to figure out in order to check whether there was enough space), and it is usually more efficient to copy a known number of bytes than to copy bytes until you hit a NUL.

The use of memcpy forces me to explicitly NUL-terminate the result, but I would otherwise have had to insert a NUL at the beginning in case the loop didn't manage to append anything. In order to leave room for the NUL, I initially allocated MAX_VALUE_LEN + 1 bytes. However, in practice I would probably start with a small allocation and exponentially realloc if necessary, rather than imposing an artificial limit and wasting memory in the common case that the artificial limit was much greater than the memory actually needed.

If the size limit is not artificial -- that is, if there is some externality which constrains the length of the appended string, such as the size of an output display box -- then one might choose to simply truncate the string rather than throwing an error for over-size results:

size_t used = 0;
result->value = malloc(MAX_VALUE_LEN + 1);
for (...) { /* loop which produces the strings to append */
  ...
  /* append a chunk */
  size_t chunk_len = strlen(chunk);
  if (MAX_VALUE_LEN - used < chunk_len) {
    chunk_len = MAX_VALUE_LEN - used;
  }
  memcpy(result->value + used, chunk, chunk_len);
  used += chunk_len;
}
result->value[used] = 0;


回答2:

Here are some issues with your code. Type Bool should be defined, and False as well. Data pointed to by pointer is not initialized. In your call to sprintf you read and write to result->value which is an undefined behavior.

Here is a full working implementation, without undefined behavior, where the value of result->name is read and result of snprintf is written to result->value:

https://taas.trust-in-soft.com/tsnippet/t/de28e2a6