All over the web I see people use the erase/remove idiom for C++ vectors like so:
#include <vector> // the general-purpose vector container
#include <iostream>
#include <algorithm> // remove and remove_if
int main()
{
// initialises a vector that holds the numbers from 0-9.
std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
// removes all elements with the value 5
v.erase( std::remove( v.begin(), v.end(), 5 ), v.end() );
return 0;
}
That is, if I want to erase all elements matching some criteria (e.g. the number 5 from a vector of int
s), then I use std::remove
or std::remove_if
in conjunction with vector.erase
like so:
vector.erase( std::remove( vector.begin(), vector.end(), <some_value>), vector.end());
This works nicely in general; std::remove
(and remove_if
) will copy (or use move semantics in C++11) the elements that are to be deleted over to the end of the vector, so the vector from our previous example will now look like this:
{ 0, 1, 2, 3, 4, 6, 7, 8, 9, 5 };
With the element 5 bolded because it's been moved to the end.
Now, std::remove
will return an iterator to it, which we then use in erase
to clear the elements out. Nice.
But what about the following example?
int main()
{
// initialises an empty vector.
std::vector<int> v = {};
// removes all elements with the value 5
v.erase( std::remove( v.begin(), v.end(), 5 ), v.end() );
return 0;
}
This seems to work as expected (not erasing anything, not segfaulting, etc.) on all platforms I run it on, but I know that just because something is working, doesn't mean it's not undefined behavior.
The quick reference for vector.erase
says this (emphasis mine):
iterator erase (const_iterator first, const_iterator last);
first, last
are
Iterators specifying a range within the vector] to be removed:
[first,last)
. i.e., the range includes all the elements betweenfirst
andlast
, including the element pointed by first but not the one pointed bylast
. Member typesiterator
andconst_iterator
are random access iterator types that point to elements.
So is vector.erase(vector.end(),vector.end())
undefined behavior?
Here's what the quick reference says about exception safety:
If the removed elements include the last element in the container, no exceptions are thrown (no-throw guarantee). Otherwise, the container is guaranteed to end in a valid state (basic guarantee). An invalid
position
orrange
causes undefined behavior.
So, the answer, at least to me appears to be "YES", and this StackOverflow answer seems to support it.
Therefore, is the common idiom wrong?
Assuming it's undefined behavior, then any call to remove
could return an iterator to vector.end()
which should be checked before calling vector.erase
, and calling remove on an empty vector does seem to return vector.end
: (IDEOne for code below)
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
int main() {
vector<int> myInts;
auto anIter = std::remove(myInts.begin(),myInts.end(),5);
if (anIter == myInts.end())
std::cout << "iterator = myInts.end()";
}
Finally, my question:
Should the actual remove/erase idiom be this?
auto endOfRangeIterator = std::remove(vector.begin(), vector.end(), <value>);
if (endOfRangeIterator != vector.end())
vector.erase(endOfRangeIterator, vector.end())
No. Because of the statement right next to the one you emphesized:
So,
vector.erase(vector.end(),vector.end())
does not attempt to erasevector.end()
because it is pointed to by parameterlast
.Granted, this definition is ambiguous and those statements can be interpreted as contradictory. The quoted wording is not used by the standard.
I think the more important in your cite is:
As we have found in comments, this citation from cpluspluc.com is incorrect. This will not violate the rules in case of
( v.end, v.end)
but will be incorrect in case ofbecause statement that contradicts itself with
cannot be a valid statement.
C++ Standard n3337 in § 23.2.2 Sequence containers requirements Table 100 specifies that
a.erase(q1,q2)
returnsiterator
. And note is:And this is what it says about the range
[i,j)
in § 24.2.1/7 Iterator requirementsThus to answer your questions
cplusplus.com is wrong in this case
No, no undefined behavior is triggered.
No, it is correct.
There is no need for this, though it is also fine.
Emphasis mine.
Further, the description of
erase
you cite is not the normative text in the standard. The standard has this to say (Table 100):This doesn't require that
q1
be dereferenceable. If [q1, q2) is an empty range (per 24.2.1/7), then no elements are in the range, and so none are erased.