I have seen many times that std::string::operator[] does not do any bounds checking. Even in "What is the difference between string::at and string::operator[]?", asked in 2013, the answers say that operator[] does not do any bounds checking.
My issue with this is if I look at the standard (in this case draft N3797) in [string.access] we have
const_reference operator[](size_type pos) const;
reference operator[](size_type pos);
- Requires: pos <= size().
- Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.
- Throws: Nothing.
- Complexity: constant time.
This leads me to believe that operator[] has to do some sort of bounds checking to determine whether it needs to return an element of the string or a default charT. Is this assumption correct, and is operator[] now required to do bounds checking?
The wording is slightly confusing, but if you study it in detail you'll find that it's actually very precise.
It says this:
- The precondition is that the argument to [] is either = n or it's < n (where n is size()).
- Assuming that precondition is satisfied: if it's < n then you get the character you asked for; otherwise (i.e. if it's = n) you get charT() (i.e. the null character).
But no rule is defined for when you break the precondition, and the check for = n can be satisfied implicitly (but isn't explicitly mandated to be) by actually storing a charT() at position n.
So implementations don't need to perform any bounds checking… and the common ones won't.
This operator of standard containers emulates the behavior of operator [] on ordinary arrays, so it does not make any checks. However, in debug mode the corresponding library can provide this checking.
If you want the index checked, use the member function at() instead.
First, there is a requires clause: pos <= size(). If you violate the requires clause, your program behaves in an undefined manner. So the language only defines what happens when that clause holds.
The next paragraph states that for pos < size(), it returns a reference to an element in the string. And for pos == size(), it returns a reference to a default-constructed charT with value charT().
While this may look like bounds checking, in practice what actually happens is that the std::basic_string allocates a buffer one larger than asked and populates the last entry with a charT(). Then [] simply does pointer arithmetic.
I have tried to come up with a way to avoid that implementation. While the standard does not mandate it, I could not convince myself an alternative exists. There was something annoying with .data() that made it difficult to avoid the single buffer.
http://en.cppreference.com/w/cpp/string/basic_string/operator_at
(Emphasis mine).
If you want bounds checking, use std::basic_string::at
The standard does not imply that the implementation needs to provide bounds checking, because it basically describes what an unchecked array access does.
If you access within bounds, it's defined. If you step outside, you trigger undefined behavior.
No, it doesn't. With the precondition pos <= size(), it can just ASSUME that it can always return an element of the string. If this condition isn't met: undefined behaviour.
The operator[] will likely just increment the pointer from the start of the string by pos. If the string is shorter, well, then it just returns a reference to the data behind the string, whatever it might be. Like a classic out-of-bounds access in simple C arrays.
To fulfill the case where pos == size(), it could just have allocated an extra charT at the end of its internal string data. So just incrementing the pointer without any checks would still deliver the stated behaviour.