What is the fundamental difference, if any, between a C++ std::vector and std::basic_string?
Question:
Answer 1:
basic_string doesn't call constructors and destructors of its elements. vector does.
Swapping basic_string invalidates iterators (enabling small string optimization); swapping vectors doesn't.
basic_string memory need not be allocated contiguously in C++03. vector is always contiguous. This difference is removed in C++11 [string.require]:
The char-like objects in a basic_string object shall be stored contiguously
basic_string has interface for string operations. vector doesn't.
basic_string may use a copy-on-write strategy (before C++11). vector can't.
Relevant quotes for non-believers:
[basic.string]:
The class template basic_string conforms to the requirements for a Sequence Container (23.2.3), for a Reversible Container (23.2), and for an Allocator-aware container (Table 99), except that basic_string does not construct or destroy its elements using allocator_traits::construct and allocator_traits::destroy and that swap() for basic_string invalidates iterators. The iterators supported by basic_string are random access iterators (24.2.7).
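To make the swap point concrete, here is a minimal sketch. The function names are made up for illustration. The vector half relies on the container guarantee that swap() keeps iterators valid; the string half simply shows the portable habit of re-acquiring iterators after a swap.

```cpp
#include <string>
#include <vector>

// For std::vector, an iterator taken before swap() stays valid and
// follows its element into the other container.
bool vector_iterator_follows_swap() {
    std::vector<char> a{'a', 'b', 'c'};
    std::vector<char> b{'x', 'y'};
    std::vector<char>::iterator it = a.begin();
    a.swap(b);  // b now owns the elements that used to be in a
    return it == b.begin() && *it == 'a';
}

// For std::string the old iterators must be treated as invalid after
// swap(), so a fresh iterator is taken once the swap is done.
char first_char_after_swap() {
    std::string a = "abc";
    std::string b = "xy";
    a.swap(b);
    std::string::iterator it = b.begin();  // acquired after the swap
    return *it;
}
```

Whether a pre-swap string iterator happens to keep working depends on the implementation and on the string length, so code should not rely on it.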
Answer 2:
basic_string gives compiler and standard library implementations a few freedoms over vector:
The "small string optimization" is valid on strings, which allows implementations to store the actual string, rather than a pointer to the string, in the string object when the string is short. Something along the lines of:
class string {
    size_t length;
    union {
        char * usedWhenStringIsLong;
        char usedWhenStringIsShort[sizeof(char*)];
    };
};
In C++03, the underlying array need not be contiguous. Implementing basic_string in terms of something like a "rope" would be possible under that standard. (Though nobody does this, because that would make the members std::basic_string::c_str() and std::basic_string::data() too expensive to implement.) C++11 now bans this behavior though.
In C++03, basic_string allows the compiler/library vendor to use copy-on-write for the data (which can save on copies), which is not allowed for std::vector. In practice, this used to be a lot more common, but it's less common nowadays because of the impact it has upon multithreading. Either way though, your code cannot rely on whether or not std::basic_string is implemented using COW. C++11 again now bans this behavior.
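One rough way to observe the small string optimization is to check whether the character buffer sits inside the string object itself. The helper below, stored_inline, is a hypothetical name and not a standard facility, and what it reports for short strings is implementation-specific.

```cpp
#include <cstdint>
#include <string>

// Hypothetical probe: reports whether the character buffer lives inside
// the footprint of the string object (i.e. no separate heap block).
bool stored_inline(const std::string& s) {
    const std::uintptr_t obj = reinterpret_cast<std::uintptr_t>(&s);
    const std::uintptr_t buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}
```

A string much longer than sizeof(std::string) cannot fit inline, so the probe returns false for it. For a short string such as "hi" the answer depends on the library in use.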
There are a few helper methods tacked on to basic_string as well, but most are simple and of course could easily be implemented on top of vector.
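As an illustration of that last point, a string::find-like helper can be sketched on top of std::vector<char> with std::search. The names find_in_vector and not_found are assumptions made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sentinel playing the role of std::string::npos.
const std::size_t not_found = static_cast<std::size_t>(-1);

// Hypothetical free function mirroring string::find for a vector<char>.
// Returns the index of the first match, or not_found.
std::size_t find_in_vector(const std::vector<char>& haystack, const char* needle) {
    const char* needle_end = needle + std::strlen(needle);
    if (needle == needle_end) return 0;  // an empty needle matches at index 0
    std::vector<char>::const_iterator it =
        std::search(haystack.begin(), haystack.end(), needle, needle_end);
    if (it == haystack.end()) return not_found;
    return static_cast<std::size_t>(it - haystack.begin());
}
```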
Answer 3:
The key difference is that std::vector must keep its data in contiguous memory, whereas std::basic_string (before C++11) need not. As a result:
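std::vector<char> v( 3, 'a' ); // three 'a' characters: the arguments are (count, value)
char* x = &v[0]; // valid: vector storage is always contiguous
std::basic_string<char> s( "aaa" );
char* x2 = &s[0]; // C++03: not guaranteed to point into a contiguous buffer
// For example, in C++03 the behavior of
std::cout << *(x2+1);
// is not guaranteed; since C++11 it is well defined.
const char* x3 = s.c_str(); // valid in every standard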
In practice this difference is not very important, and C++11 removes it by requiring contiguous storage for strings as well.
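Under the C++11 guarantee, writing through &s[0] is well defined. The following is a minimal sketch; copy_through_pointer and is_contiguous are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copies a C string into a std::string by writing through &s[0].
// Relies on the C++11 guarantee that the characters are contiguous.
std::string copy_through_pointer(const char* src) {
    const std::size_t n = std::strlen(src);
    std::string s(n, '\0');
    if (n != 0) std::memcpy(&s[0], src, n);
    return s;
}

// Checks that element addresses advance one char at a time.
bool is_contiguous(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i)
        if (&s[i] != &s[0] + i) return false;
    return true;
}
```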
Answer 4:
A vector is a data structure that behaves like an array. Internally it is a dynamically allocated array.
The basic_string class represents a Sequence of characters. It contains all the usual operations of a Sequence, and, additionally, it contains standard string operations such as search and concatenation.
You can use vector to keep whatever data type you want: std::vector<int>, std::vector<float>, or even std::vector< std::vector<T> >, but a basic_string is meant for representing "text": its element type must be a char-like (POD) type.
Answer 5:
The basic_string provides many string-specific comparison options. You are right in that the underlying memory management interface is very similar, but string contains many additional members, like c_str(), that would make no sense for a vector.
Answer 6:
One difference between std::string and std::vector is that programs may construct a string from a null-terminated string, whereas with vectors they cannot.
std::string a = "hello"; // okay
std::vector<char> b = "goodbye"; // compiler error
This often makes strings easier to work with.
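A vector<char> can still be built from the same text, just not by direct initialization from a literal. A small sketch using the iterator-range constructor follows; to_vector is a hypothetical helper name.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Builds a vector<char> from a null-terminated string (terminator excluded).
std::vector<char> to_vector(const char* text) {
    return std::vector<char>(text, text + std::strlen(text));
}

// Builds a vector<char> from the characters of a std::string.
std::vector<char> to_vector(const std::string& text) {
    return std::vector<char>(text.begin(), text.end());
}
```

The resulting vector is not null-terminated, so it cannot be handed to C APIs that expect a terminated string without appending '\0' first.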
Answer 7:
TLDR: strings are optimized to only contain character primitives; vectors can contain primitives or objects
The preeminent difference between vector and string is that vector can correctly contain objects, while string works only on primitives. So vector provides these methods that would be useless for a string working with primitives:
- vector::emplace
- vector::emplace_back
- vector::~vector
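To see why these members matter for objects, here is a sketch with a hypothetical Tracker type that counts constructor and destructor calls. It only illustrates that vector pairs every construction with a destruction.

```cpp
#include <vector>

// Hypothetical element type that counts constructor and destructor calls.
struct Tracker {
    static int constructed;
    static int destroyed;
    int id;
    explicit Tracker(int i) : id(i) { ++constructed; }
    Tracker(const Tracker& other) : id(other.id) { ++constructed; }
    ~Tracker() { ++destroyed; }
};
int Tracker::constructed = 0;
int Tracker::destroyed = 0;

// Builds a vector of n trackers in place and lets it go out of scope.
// Returns the number of Tracker objects still alive afterwards.
int live_objects_after_scope(int n) {
    {
        std::vector<Tracker> v;
        for (int i = 0; i < n; ++i) v.emplace_back(i);  // constructs in place
    }  // ~vector destroys every element here
    return Tracker::constructed - Tracker::destroyed;
}
```

The total number of constructions can exceed n, because reallocation copies existing elements, but each copy is also destroyed, so the balance returns to zero.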
Even extending string will not allow it to correctly handle objects, because it never runs constructors or destructors on its elements. This should not be viewed as a drawback; it allows significant optimization over vector, in that string can:
- Do short string optimization, potentially avoiding heap allocation, with little to no increased storage overhead
- Use char_traits, one of string's template arguments, to define how operations should be implemented on the contained primitives (of which only char, wchar_t, char16_t, and char32_t are implemented: http://en.cppreference.com/w/cpp/string/char_traits)
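As a sketch of what customizing char_traits can look like, the classic case-insensitive traits class is shown below. The names ci_char_traits, fold, and ci_string are assumptions for this example and are not part of the standard library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: compares characters ignoring case.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return nullptr;
    }
};

// A string type whose comparisons ignore case.
typedef std::basic_string<char, ci_char_traits> ci_string;
```

Note that ci_string is a distinct type from std::string, so the two do not convert into each other implicitly.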
Particularly relevant are char_traits::copy, char_traits::move, and char_traits::assign, obviously implying that direct assignment, rather than construction or destruction, will be used, which is again preferable for primitives. All this specialization has the additional drawbacks to string that:
- Only char, wchar_t, char16_t, or char32_t primitive types will be used. Obviously, primitives of sizes up to 32 bits could use their equivalently sized char_type: https://stackoverflow.com/a/35555016/2642059, but for primitives such as long long a new specialization of char_traits would need to be written, and the idea of specializing char_traits::eof and char_traits::not_eof instead of just using vector<long long> doesn't seem like the best use of time.
- Because of short string optimization, iterators are invalidated by all the operations that would invalidate a vector iterator, but string iterators are additionally invalidated by string::swap and string::operator=
Additional differences in the interfaces of vector and string:
- There is no mutable string::data: Why Doesn't std::string.data() provide a mutable char*? (A non-const overload was added later, in C++17.)
- string provides functionality for working with words unavailable in vector: string::c_str, string::length, string::append, string::operator+=, string::compare, string::replace, string::substr, string::copy, string::find, string::rfind, string::find_first_of, string::find_first_not_of, string::find_last_of, string::find_last_not_of, plus the non-member functions operator+, operator>>, operator<<, std::stoi, std::stol, std::stoll, std::stoul, std::stoull, std::stof, std::stod, std::stold, std::to_string, and std::to_wstring
- Finally, everywhere vector accepts arguments of another vector, string accepts a string or a char*
Note this answer is written against C++11, so strings are required to be allocated contiguously.