Which is a better C++ container for holding and accessing binary data: std::vector<unsigned char> or std::string?
Is one more efficient than the other?
Is one a more 'correct' usage?
You should prefer std::vector over std::string. In common cases both solutions can be almost equivalent, but std::strings are designed specifically for strings and string manipulation, and that is not your intended use.
Both are correct and equally efficient. The only reason to use one of them instead of a plain array is to ease memory management and to make the data easier to pass as an argument.
I use vector because the intention is clearer than with string.
Edit: The C++03 standard does not guarantee std::basic_string memory contiguity. However, from a practical viewpoint, there are no commercial non-contiguous implementations. C++0x (since published as C++11) standardizes that guarantee.
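To illustrate the point about contiguity and passing the data as an argument, here is a minimal sketch. The `checksum` function is a hypothetical stand-in for any C-style API that takes a pointer and a length, and the `checksum_of` overloads are names made up for this example; neither comes from the answers above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style API that consumes raw bytes.
inline unsigned long checksum(const unsigned char* data, std::size_t len) {
    unsigned long sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    return sum;
}

// std::vector: elements are contiguous, so &v[0] is a valid C array pointer.
// The empty case is guarded because &v[0] on an empty vector is undefined.
inline unsigned long checksum_of(const std::vector<unsigned char>& v) {
    return v.empty() ? 0UL : checksum(&v[0], v.size());
}

// std::string: data() plus size() describe the buffer. A cast is needed
// because the string stores char, not unsigned char.
inline unsigned long checksum_of(const std::string& s) {
    return checksum(reinterpret_cast<const unsigned char*>(s.data()), s.size());
}
```

Either container can be handed to the same function; the main visible difference is the cast that the std::string version needs.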
Is one more efficient than the other?
This is the wrong question.
Is one a more 'correct' usage?
This is the correct question.
It depends. How is the data being used? If you are going to use the data in a string-like fashion, then you should opt for std::string, as using a std::vector may confuse subsequent maintainers. If, on the other hand, most of the data manipulation looks like plain maths or is vector-like, then a std::vector is more appropriate.
For the longest time I agreed with most answers here. However, just today it hit me why it might be wiser to actually use std::string over std::vector<unsigned char>.
As most agree, either one will work just fine. But oftentimes file data is actually in text format (more common now that XML has become mainstream). Holding it in a std::string makes it easy to view in the debugger when that becomes pertinent (and these debuggers will often let you navigate the bytes of the string anyway). More importantly, many existing functions that work on a string can just as easily be used on file or binary data. I had found myself writing multiple functions to handle both strings and byte arrays, and realized how pointless it all was.
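As a rough sketch of reusing string functions on binary data, the helper below searches a byte buffer with `find` and slices it with `substr`. The function name `payload_after_marker` and the two-byte marker `0x00 0x01` are assumptions made up for this illustration, not part of any real format.

```cpp
#include <string>

// Hypothetical helper: returns the bytes that follow a two-byte marker
// (0x00 0x01, chosen only for illustration), or an empty string if the
// marker is not present.
inline std::string payload_after_marker(const std::string& buffer) {
    // The explicit length keeps the embedded NUL byte in the marker.
    const std::string marker("\x00\x01", 2);
    const std::string::size_type pos = buffer.find(marker);
    if (pos == std::string::npos) {
        return std::string();
    }
    return buffer.substr(pos + marker.size());
}
```

Note that the buffer has to be built with an explicit length (or from iterators); constructing a std::string from a bare `const char*` would stop at the first zero byte.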
This is a comment on dribeas's answer. I am writing it as an answer so that I can format the code.
This is the char_traits compare function, and the behaviour is quite healthy:
    static bool
    lt(const char_type& __c1, const char_type& __c2)
    { return __c1 < __c2; }

    template<typename _CharT>
      int
      char_traits<_CharT>::
      compare(const char_type* __s1, const char_type* __s2, std::size_t __n)
      {
        for (std::size_t __i = 0; __i < __n; ++__i)
          if (lt(__s1[__i], __s2[__i]))
            return -1;
          else if (lt(__s2[__i], __s1[__i]))
            return 1;
        return 0;
      }
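To see this from the caller's side, here is a small sketch that runs `std::char_traits<char>::compare` over byte buffers. The wrapper name `sign_of_compare` is made up for this example, and it assumes the caller passes an `n` no larger than either buffer. The standard only specifies the sign of the result, so the wrapper normalises it to -1, 0 or 1.

```cpp
#include <cstddef>
#include <string>

// Hypothetical wrapper: compares the first n bytes of a and b and reduces
// the result to -1, 0 or 1. The caller must ensure n <= a.size() and
// n <= b.size().
inline int sign_of_compare(const std::string& a, const std::string& b,
                           std::size_t n) {
    const int r = std::char_traits<char>::compare(a.data(), b.data(), n);
    return (r > 0) - (r < 0);
}
```

The comparison runs over the full length `n` and does not stop at an embedded zero byte, which is what binary data needs.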
As far as readability is concerned, I prefer std::vector. std::vector should be the default container in this case: the intent is clearer and, as other answers have already said, on most implementations it is also more efficient.
On one occasion I did prefer std::string over std::vector though. Let's look at the signatures of their move constructors in C++11:
    vector (vector&& x);
    string (string&& str) noexcept;
On that occasion I really needed a noexcept move constructor. In C++11, std::string provides one and std::vector does not (std::vector's move constructor only became noexcept in C++17).
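If you depend on this property, one option is to ask the compiler instead of relying on the documented signatures. The following is only a sketch: the helper name `vector_move_is_noexcept` is made up here, and the result for std::vector depends on the standard version and library in use.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// std::string's move constructor is declared noexcept since C++11,
// so this check is expected to hold on a conforming library.
static_assert(std::is_nothrow_move_constructible<std::string>::value,
              "std::string should be nothrow move constructible");

// For std::vector the answer varies with the standard version and the
// library, so query it rather than assume it.
inline bool vector_move_is_noexcept() {
    return std::is_nothrow_move_constructible<
        std::vector<unsigned char> >::value;
}
```

A `static_assert` on the vector case as well would turn the requirement into a compile-time error on toolchains that do not meet it.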
If you just want to store your binary data, you can use bitset, which optimizes for space allocation. Otherwise go for vector, as it's more appropriate for your usage.
Compare these two and choose for yourself which is more specific to your needs. Both are very robust and work with STL algorithms ... Choose for yourself which is more effective for your task.
Personally I prefer std::string because string::data() is much more intuitive for me when I want my binary buffer back in C-compatible form. I know that vector elements are guaranteed to be stored contiguously, but exercising this in code feels a little bit unsettling.
This is a style decision that an individual developer or a team should make for themselves.