Benefits of vector over string?

Posted 2019-02-02 11:49

Question:

This question is related to, but not quite the same as, this question.

Are there any benefits to using std::vector<char> instead of std::string to hold arbitrary binary data, aside from readability-related issues?

i.e. Are there any tasks which are easier/more efficient/better to perform with a vector compared to a string?

Answer 1:

Aside from readability (which should not be underestimated) I can think of a couple of minor performance/memory issues with using std::string over std::vector:

  • Some modern std::string implementations use the small string optimization. If you are storing data that's larger than the string's internal buffer, it becomes a pessimization, reducing the efficiency of copying, moving, and swap [1] and increasing the sizeof() for no benefit.

  • An efficient std::string implementation will always allocate at least 1 more byte than the current size for storing a terminating null (not doing so requires extra logic in operator[] to cope with str[size()]).

I should stress that both of these issues are very minor; the performance cost of them will more than likely be lost in the background noise. But you did ask.


[1] Those operations require branching on size() if the small string optimization is being used, whereas they don't in a good std::vector implementation.
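To see these two effects on your own toolchain, something like the following rough sketch may help. The names Footprint, footprint and has_terminator are made up for illustration, and the actual sizeof and capacity values are implementation-defined, so no particular numbers are claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: size of the object itself plus the number of
// elements it currently has room for.
struct Footprint {
    std::size_t object_bytes;
    std::size_t capacity;
};

inline Footprint footprint(const std::string& s) {
    return {sizeof(s), s.capacity()};
}

inline Footprint footprint(const std::vector<char>& v) {
    return {sizeof(v), v.capacity()};
}

// Since C++11, s[s.size()] on a const string and c_str()[s.size()] must both
// refer to a null character, so the string needs room for it beyond size().
inline bool has_terminator(const std::string& s) {
    return s[s.size()] == '\0' && s.c_str()[s.size()] == '\0';
}
```

Comparing footprint(std::string(100, 'x')).object_bytes with the same call on a std::vector<char> shows whether your standard library pays extra object size for an internal buffer that a large payload never uses.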



Answer 2:

Beyond readability, and ensuring another maintainer does not confuse the purpose of the std::string, there is not a lot of difference in function. You could of course consider char*/malloc as well, if efficiency is the only consideration.

One potential issue I can think of:

std::string is a typedef for std::basic_string<char>, so it always stores char. If you later needed to handle another type (e.g. unsigned short) you might need to either:

  • Create your own typedef std::basic_string<unsigned short> (which moves you away from normal std::string handling)
  • Tentatively apply some reinterpret_cast logic in a setter.

With a vector you could simply change the container to a std::vector<unsigned short>.
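A minimal sketch of that point, assuming the buffer type is hidden behind an alias (Element, Buffer, make_buffer and size_in_bytes are hypothetical names, not from the question). Switching the element type is then a one-line change, whereas std::basic_string also needs a suitable char_traits, which the standard only provides for the character types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical aliases: change Element to switch what the buffer stores.
using Element = unsigned short;
using Buffer = std::vector<Element>;

// Build a buffer of n elements, each set to fill.
inline Buffer make_buffer(std::size_t n, Element fill) {
    return Buffer(n, fill);
}

// Payload size in bytes, independent of the chosen element type.
inline std::size_t size_in_bytes(const Buffer& b) {
    return b.size() * sizeof(Element);
}
```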



Answer 3:

I think the only benefit you would gain from doing that would be the ease of iterating over the std::vector of characters, but even that can be done with a std::string.

You have to remember that even though std::string seems like an object, it can be accessed like an array, so even accessing specific parts of a string can be done without the use of a std::vector.
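As a quick illustration (byte_sum and byte_sum_indexed are names invented for this sketch), the same template code iterates and indexes either container, since both offer begin()/end(), size() and operator[]:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sum all bytes using a range-based for loop (iterator access).
template <class Container>
unsigned long byte_sum(const Container& c) {
    unsigned long total = 0;
    for (char ch : c) {
        total += static_cast<unsigned char>(ch);
    }
    return total;
}

// Same sum using array-style indexing.
template <class Container>
unsigned long byte_sum_indexed(const Container& c) {
    unsigned long total = 0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        total += static_cast<unsigned char>(c[i]);
    }
    return total;
}
```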



Answer 4:

Ideally one would use vector<unsigned char> to store arbitrary binary data - but I think you already knew this - as you referred to the old question.

Other than that, using vector would be slightly more memory efficient, as string adds a terminating null character. Performance might also differ, since the allocation strategy is not the same for both. Vectors guarantee contiguous memory (although since C++11 std::string does as well).

Besides that, using a string would not be correct, as callers/users might inadvertently invoke some of the string methods, which could be a disaster.
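One concrete way this can go wrong, as a small sketch: building a std::string from a bare const char* treats the data as a C string and stops at the first null byte, silently dropping the rest of a binary payload. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Treats data as a null-terminated C string: stops at the first '\0'.
inline std::string from_c_string(const char* data) {
    return std::string(data);
}

// Passing the length explicitly keeps embedded null bytes.
inline std::string from_bytes_str(const char* data, std::size_t n) {
    return std::string(data, n);
}

// vector has no C-string constructor, so the length is always explicit.
inline std::vector<char> from_bytes_vec(const char* data, std::size_t n) {
    return std::vector<char>(data, data + n);
}
```

With a 4-byte payload such as {'a', '\0', 'b', 'c'}, the first function keeps only one byte while the other two keep all four.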



Answer 5:

Yes, vector<char> does offer a stronger guarantee than string.

Unlike string, vector<char> is guaranteed to preserve iterators, references, etc. during a swap operation. See: May std::vector make use of small buffer optimization?
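A small sketch of that guarantee, assuming the default allocator (swap_keeps_iterators is a made-up name): after the swap, the iterator and pointer taken from the first vector still refer to the same elements, which now belong to the second vector. Doing the same with std::string is not safe in general, because a short string may keep its characters inside the object itself.

```cpp
#include <cassert>
#include <vector>

// Returns true if an iterator and a pointer obtained before swap() still
// refer to the same elements afterwards (now owned by the other vector).
inline bool swap_keeps_iterators() {
    std::vector<char> a{'a', 'b', 'c'};
    std::vector<char> b{'x', 'y'};

    auto it = a.begin();        // refers to the 'a' element
    const char* p = a.data();   // address of the same element

    a.swap(b);                  // buffers are exchanged, elements do not move

    return it == b.begin() && p == b.data() && *it == 'a';
}
```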