Vector's new method data() provides a const and non-const version. However string's data() method only provides a const version.
I think they changed the wording about std::string so that the chars are now required to be contiguous (like std::vector).
Was std::string::data just missed? Or is there a good reason to only allow const access to a string's underlying characters?
Note: std::vector::data has another nice feature: it's not undefined behavior to call data() on an empty vector, whereas &vec.front() is undefined behavior if the vector is empty.
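To make the note about empty vectors concrete, here is a minimal sketch. The helper names sum, sum_vector and sum_vector_cxx03 are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style helper: sums n ints starting at p.
// When n == 0 the pointer is never dereferenced.
int sum(const int* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}

// C++11 style: data() is well defined even for an empty vector
// (the returned pointer may be null, but it is never dereferenced here).
int sum_vector(const std::vector<int>& v) {
    return sum(v.data(), v.size());
}

// C++03 style: front() on an empty vector is undefined behavior,
// so the empty case has to be handled separately.
int sum_vector_cxx03(const std::vector<int>& v) {
    return v.empty() ? 0 : sum(&v.front(), v.size());
}
```

With data(), the empty case needs no special branch; with &v.front(), forgetting the empty() check is a latent bug.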
In C++98/03 there was good reason not to have a non-const data(), because string was often implemented as COW. A non-const data() would have required a copy to be made if the refcount was greater than 1. While possible, this was not seen as desirable in C++98/03.

In Oct. 2005 the committee voted in LWG 464, which added the const and non-const data() to vector, and added const and non-const at() to map. At that time, string had not been changed so as to outlaw COW. But later, by C++11, a COW string is no longer conforming. The string spec was also tightened up in C++11 such that it is required to be contiguous, and there's always a terminating null exposed by operator[](size()). In C++03, the terminating null was only guaranteed by the const overload of operator[].

So in short, a non-const data() looks a lot more reasonable for a C++11 string. To the best of my knowledge, it was never proposed.

Update: a non-const data() was added to basic_string in the C++1z working draft N4582 by David Sankel's P0272R1 at the Jacksonville meeting in Feb. 2016.

Nice job David!
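As a rough illustration of what the non-const data() enables, here is a sketch that assumes a C++17 compiler. fill_with stands in for any C-style API that writes into a caller-supplied buffer; it and the other function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style API: writes n copies of c into buf.
void fill_with(char* buf, std::size_t n, char c) {
    std::memset(buf, c, n);
}

// Lets the C-style API write directly into the string's own storage.
std::string make_filled(std::size_t n, char c) {
    std::string s(n, '\0');
    fill_with(s.data(), s.size(), c);  // C++17: data() returns char* here
    // In C++11/14 the usual workaround was &s[0] instead of s.data().
    assert(s[s.size()] == '\0');       // terminating null via operator[](size())
    return s;
}

// Checks the contiguity guarantee: element n lives at data() + n.
bool is_contiguous(const std::string& s) {
    for (std::size_t n = 0; n < s.size(); ++n)
        if (&s[n] != s.data() + n) return false;
    return true;
}
```

Only the first size() characters may be overwritten this way; modifying the terminating null through the pointer is still undefined behavior.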
@Christian Rau

From the time the original Plauger (around 1995, I think) string class was STL-ized by the committee (turned into a Sequence, templatified), std::string has always been std::vector plus string-related stuff (conversion from/to 0-terminated, concatenation, ...), plus some oddities, like COW that's actually "Copy on Write and on non-const begin()/end()/operator[]".

But ultimately a std::string is really a std::vector under another name, with a slightly different focus and intent. So:
- like std::vector, std::string has either a size data member or both start and end data members;
- like std::vector, std::string does not care about the value of its elements, embedded NUL or others.

std::string is not a C string with syntax sugar, utility functions and some encapsulation, just like std::vector<T> is not T[] with syntax sugar, utility functions and some encapsulation.

Although I'm not that well-versed in the standard, it might be due to the fact that std::string doesn't need to contain null-terminated data, but it can, and it doesn't need to contain an explicit length field, but it can. So changing the underlying data, e.g. by adding a '\0' in the middle, might get the string's length field out of sync with the actual char data and thus leave the object in an invalid state.

Historically, string data has only been accessible as const, because mutable access would prevent several common optimizations, like copy-on-write (COW). COW is now, if I'm not mistaken, far less common, because it behaves badly with multithreaded programs.
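The earlier point that std::string does not care about embedded NULs can be checked with a small sketch. The struct Lengths and the function overwrite_middle_with_nul are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

struct Lengths {
    std::size_t size;      // what std::string itself reports
    std::size_t c_strlen;  // what a C function scanning for '\0' sees
};

// Overwrites the middle character with '\0' and reports both lengths.
Lengths overwrite_middle_with_nul(std::string s) {
    assert(!s.empty());
    s[s.size() / 2] = '\0';  // an ordinary element write; size() is unaffected
    return Lengths{s.size(), std::strlen(s.c_str())};
}
```

For an input such as "abcdef", size stays 6 while strlen sees only 3 characters. The string object itself stays valid; only code that treats the buffer as a C string sees a shorter value.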
BTW, yes they are now required to be contiguous:
Another reason might be to avoid code such as:
Or any other function that returns a preallocated char array.