How to memset char array with null terminating character

Posted 2019-02-12 02:53

Question:

What is the correct and safest way to memset the whole character array with the null terminating character? I can list a few usages:

...
char* buffer = new char [ARRAY_LENGTH];

//Option 1:             memset( buffer, '\0', sizeof(buffer) );
//Option 2 before edit: memset( buffer, '\0', sizeof(char*) * ARRAY_LENGTH );
//Option 2 after edit:  memset( buffer, '\0', sizeof(char) * ARRAY_LENGTH );
//Option 3:             memset( buffer, '\0', ARRAY_LENGTH );
...
  • Do any of these have a significant advantage over the others?
  • What kind of issues can I face with usages 1, 2 or 3?
  • What is the best way to handle this request?

Answer 1:

Options one and two are just wrong. The first one uses the size of a pointer instead of the size of the array, so it probably won't write to the whole array. The second uses sizeof(char*) instead of sizeof(char), so it will write past the end of the array. Option 3 is okay. You could also use this:

memset( buffer, '\0', sizeof(char)*ARRAY_LENGTH );

but sizeof(char) is guaranteed to be 1.
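The difference between the pointer size and the array size can be checked directly. The following is a minimal sketch, not part of the original answer; the value of ARRAY_LENGTH and the function names are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 64;

// sizeof(char) is 1 by definition, so Option 3 and
// sizeof(char) * ARRAY_LENGTH name the same byte count.
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");

// Returns the byte count that Option 1 would pass to memset:
// the size of the pointer variable, not of the allocated array.
std::size_t option1_byte_count() {
    char* buffer = new char[ARRAY_LENGTH];
    std::size_t count = sizeof(buffer);  // same as sizeof(char*)
    delete[] buffer;
    return count;
}

// Returns true if Option 3 clears every byte of the array.
bool option3_clears_everything() {
    char* buffer = new char[ARRAY_LENGTH];
    std::memset(buffer, 'x', ARRAY_LENGTH);   // non-zero marker first
    std::memset(buffer, '\0', ARRAY_LENGTH);  // Option 3
    bool all_zero = true;
    for (std::size_t i = 0; i < ARRAY_LENGTH; ++i) {
        if (buffer[i] != '\0') {
            all_zero = false;
        }
    }
    delete[] buffer;
    return all_zero;
}
```

On common platforms option1_byte_count() returns 4 or 8, which is why Option 1 leaves most of a 64-byte array untouched.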



Answer 2:

The idiomatic way is value-initializing the array:

char* buffer = new char [ARRAY_LENGTH]();
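As a quick check of what the trailing () does, here is a small sketch. The length and the function name are assumptions made for the example, not something from the question.

```cpp
#include <cstddef>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 32;

// Returns true if every element of a value-initialized array is '\0'.
bool value_initialized_is_zero() {
    // The trailing () value-initializes each char, which means zero.
    char* buffer = new char[ARRAY_LENGTH]();
    bool all_zero = true;
    for (std::size_t i = 0; i < ARRAY_LENGTH; ++i) {
        if (buffer[i] != '\0') {
            all_zero = false;
        }
    }
    delete[] buffer;
    return all_zero;
}
```

Without the (), the elements are left uninitialized and reading them before writing is not safe.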

Option 1 only sets the first sizeof(char*) bytes to 0, or runs into undefined behavior if ARRAY_LENGTH < sizeof(char*).

Option 2 runs into undefined behavior because you're attempting to set more than ARRAY_LENGTH bytes. sizeof(char*) is almost certainly greater than 1.

Since this is C++ though (no new in C), I suggest you use a std::string instead.

For C (assuming malloc instead of new[]), you can use

memset( buffer, 0, ARRAY_LENGTH );


Answer 3:

Since the question keeps changing, I define:

1: memset( buffer, '\0', sizeof(buffer) );

2a: memset( buffer, '\0', sizeof(char*) * ARRAY_LENGTH );

2b: memset( buffer, '\0', sizeof(char) * ARRAY_LENGTH );

3: memset( buffer, '\0', ARRAY_LENGTH );

If the question is merely, "what is the correct way to call memset" rather than "what is the best way to zero this array", then either 2b or 3 is correct. 1 and 2a are wrong.

You can have a style war over 2b vs 3: whether to include the sizeof(char) or not -- some people leave it out because it's redundant (I usually do), other people put it in to create a kind of consistency with the same code setting an array of int. That is to say they always multiply a size by a number of elements, even though they know the size is 1. One possible conclusion is that the "safest" way to memset the array pointed to by buffer is:

std::memset(buffer, 0, sizeof(*buffer) * ARRAY_LENGTH);

This code remains correct if the type of buffer changes, provided of course that it continues to have ARRAY_LENGTH elements of whatever type that is, and provided that all-bits-zero remains the correct initial value.
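To illustrate the point about changing the element type, the sketch below wraps the same memset call in a template so it can be tried with char and with int. The template, its name and the length are hypothetical; the only assumption carried over from the answer is that all-bits-zero is the right initial value for the type.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 16;

// Allocates ARRAY_LENGTH elements of T, dirties them, zeroes them with
// sizeof(*buffer) * ARRAY_LENGTH, and reports whether all compare equal to 0.
// Only meaningful for types where all-bits-zero represents the value 0.
template <typename T>
bool zero_with_sizeof_deref() {
    T* buffer = new T[ARRAY_LENGTH];
    std::memset(buffer, 0xFF, sizeof(*buffer) * ARRAY_LENGTH);  // dirty it
    std::memset(buffer, 0, sizeof(*buffer) * ARRAY_LENGTH);     // zero it
    bool all_zero = true;
    for (std::size_t i = 0; i < ARRAY_LENGTH; ++i) {
        if (buffer[i] != T(0)) {
            all_zero = false;
        }
    }
    delete[] buffer;
    return all_zero;
}
```

With T = int, passing plain ARRAY_LENGTH as the byte count would clear only a fraction of the array, whereas the sizeof(*buffer) form keeps covering all of it.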

Another option beloved of "C++ is not C" programmers, is:

/* never mind how buffer is allocated */
std::fill(buffer, buffer + ARRAY_LENGTH, 0);

If you care, you can then check for yourself whether or not your compiler optimizes this to the same code to which it optimizes the equivalent call to std::memset.
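The sketch below only checks that the two calls leave identical bytes behind; it says nothing about the generated machine code, which you would have to inspect with your own compiler. The function name and length are assumptions for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 128;

// Zeroes one buffer with std::fill and another with std::memset,
// then reports whether the two end up byte-for-byte identical.
bool fill_matches_memset() {
    char filled[ARRAY_LENGTH];
    char memsetted[ARRAY_LENGTH];

    // Start from different non-zero contents so the comparison is meaningful.
    std::memset(filled, 'a', ARRAY_LENGTH);
    std::memset(memsetted, 'b', ARRAY_LENGTH);

    std::fill(filled, filled + ARRAY_LENGTH, '\0');
    std::memset(memsetted, 0, ARRAY_LENGTH);

    return std::memcmp(filled, memsetted, ARRAY_LENGTH) == 0;
}
```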

char *buffer = new char [ARRAY_LENGTH](); is nifty but almost useless in C++ in practice because you pretty much never allocate an array with new in the first place.

std::string buffer(ARRAY_LENGTH, 0); introduces a particular way of managing the buffer, which may or may not be what you want but often is. There's a lot to be said for char buffer[ARRAY_LENGTH] = {0}; in some cases.
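Both alternatives from the paragraph above can be sketched as follows. The function names and the length are hypothetical, and '\0' is written instead of 0 purely to make the character type explicit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 10;

// A std::string holding ARRAY_LENGTH null characters; the string owns
// and releases its storage, so there is nothing to delete.
std::string make_zeroed_string() {
    return std::string(ARRAY_LENGTH, '\0');
}

// A true array (not a pointer): = {0} sets the first element to 0 and
// zero-initializes the rest. Returns true if every element is '\0' and
// sizeof reports the full array size.
bool stack_array_is_zeroed() {
    char buffer[ARRAY_LENGTH] = {0};
    bool all_zero = (sizeof(buffer) == ARRAY_LENGTH);
    for (std::size_t i = 0; i < ARRAY_LENGTH; ++i) {
        if (buffer[i] != '\0') {
            all_zero = false;
        }
    }
    return all_zero;
}
```

Note that for the true array, sizeof(buffer) really is the array size, so the Option 1 form of memset would be correct there, unlike with a char* obtained from new.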



Answer 4:

  • Do any of these have a significant advantage over the others?
  • What kind of issues can I face with usages 1, 2 or 3?

1st is wrong, because sizeof(buffer) == sizeof(char*).

2nd and 3rd are OK.

  • What is the best way to handle this request?

Why not just:

buffer[0] = '\0';

If this is a char array, why bother with the rest of the characters? With the first byte set to zero, you have the equivalent of "" in your buffer.
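A small sketch of that claim, with hypothetical names and length: after clearing only the first byte, the buffer reads as an empty C string, but the remaining bytes keep whatever they held before.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 16;

// Returns the C-string length of a buffer whose first byte was set to '\0'.
std::size_t length_after_first_byte_cleared() {
    char buffer[ARRAY_LENGTH];
    std::memset(buffer, 'x', ARRAY_LENGTH);  // stand-in for old contents
    buffer[0] = '\0';
    return std::strlen(buffer);
}

// Returns true if the bytes after the first one still hold the old contents.
bool rest_is_untouched() {
    char buffer[ARRAY_LENGTH];
    std::memset(buffer, 'x', ARRAY_LENGTH);
    buffer[0] = '\0';
    for (std::size_t i = 1; i < ARRAY_LENGTH; ++i) {
        if (buffer[i] != 'x') {
            return false;
        }
    }
    return true;
}
```

This is enough when the buffer is only ever read as a string; it is not enough if stale data must not remain in memory or if later code writes characters without appending its own terminator.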

Of course, if you really insist on having all of buffer zeroed, use the answer with std::fill - this is the proper way. I mean std::fill(buffer, buffer + ARRAY_LENGTH, 0);.



Answer 5:

If you absolutely must use a raw array in C++ (it's a very ungood idea), do it like this:

char* buffer = new char [ARRAY_LENGTH]();

For C++ memset is generally the last refuge of the incompetent, although I learned within the last few months that for acceptable performance, with current tools, it's necessary to go down to that level when one implements one's own string class.

Instead of these raw arrays etc., which can appear to need memset, use e.g. std::string (for the above case), std::vector, std::array etc.
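As one possible illustration of the container route, here is a sketch using std::vector<char>. The helper names are hypothetical; the relevant fact is that the size-only constructor value-initializes its elements, so no memset is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns a buffer of `length` chars, all '\0'. The vector frees its
// storage automatically when it goes out of scope.
std::vector<char> make_zeroed_buffer(std::size_t length) {
    return std::vector<char>(length);
}

// Returns true if every element of the buffer is '\0'.
bool is_all_zero(const std::vector<char>& buffer) {
    return std::all_of(buffer.begin(), buffer.end(),
                       [](char c) { return c == '\0'; });
}
```

If a C API needs a char*, buffer.data() provides one while the vector stays alive.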



Answer 6:

Since C++11, I would choose:

#include <array>

std::array<char, ARRAY_LENGTH> buffer{ '\0' };

buffer.fill('\0');
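The two lines above do slightly different jobs: the initializer zeroes the array at construction, while fill is what you would call to clear it again later. A minimal sketch, with hypothetical names and length, and using empty braces {} as a swapped-in equivalent of { '\0' } for the initial zeroing:

```cpp
#include <array>
#include <cstddef>

// Hypothetical length, chosen only for this illustration.
constexpr std::size_t ARRAY_LENGTH = 8;

// Returns true if every element of the array is '\0'.
bool is_all_zero(const std::array<char, ARRAY_LENGTH>& buffer) {
    for (char c : buffer) {
        if (c != '\0') {
            return false;
        }
    }
    return true;
}

// Returns true if a freshly constructed array starts out zeroed.
bool array_starts_zeroed() {
    std::array<char, ARRAY_LENGTH> buffer{};  // all elements zero
    return is_all_zero(buffer);
}

// Returns true if fill('\0') clears an array that has been written to.
bool fill_clears_used_array() {
    std::array<char, ARRAY_LENGTH> buffer{};
    buffer[3] = 'x';                          // simulate use
    bool was_dirty = !is_all_zero(buffer);
    buffer.fill('\0');                        // clear it again
    return was_dirty && is_all_zero(buffer);
}
```

Because the size is part of the type, there is no separate length to pass and no way to get the byte count wrong.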


Answer 7:

Option 3: memset( buffer, '\0', ARRAY_LENGTH ): passes only the length of the array, but this parameter is actually the total number of bytes of memory to set.

Option 1: memset( buffer, '\0', sizeof(buffer) ): gives the wrong result because buffer is a char*. sizeof(buffer) yields only the size of the pointer variable, not the size of the whole array.

Option 2 is right.



Answer 8:

Well, personally I like option 3:

memset( buffer, '\0', ARRAY_LENGTH )

ARRAY_LENGTH is exactly the amount of memory I want to fill.