Recently (from an SO comment) I learned that std::remove and std::remove_if are stable. Am I wrong to think that this is a terrible design choice, since it prevents certain optimizations?
Imagine removing the first and fifth elements of a 1M-element std::vector. Because of stability, we can't implement remove with swaps. Instead we must shift every remaining element. :(
If we weren't limited by stability we could (for random-access and bidirectional iterators) practically have two iterators, one from the front and one from the back, and then use swap to bring the to-be-removed items to the end. I'm sure smart people could do even better. My question is general, not about the specific optimization I'm describing.
EDIT: please note that C++ advertises the zero-overhead principle, and also that there are both std::sort and std::stable_sort algorithms.
EDIT2: the optimization would be something like the following, for remove_if:
- bad_iter looks from the beginning for elements for which the predicate returns true.
- good_iter looks from the end for elements for which the predicate returns false.
When both have found what they are looking for, they swap their elements. Termination is when good_iter <= bad_iter.
If it helps, think of it like the two-iterator partitioning pass in the quicksort algorithm, except that instead of comparing elements to a pivot we use the above predicate; a sketch follows.
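Here is a minimal sketch of what I mean, assuming bidirectional iterators (unstable_remove_if is just my name for it, nothing standard; note that with plain bidirectional iterators the good_iter <= bad_iter test has to become an equality check performed at each step):

#include <utility> // std::swap

// Sketch only: bad_iter scans from the front for elements to remove,
// good_iter scans from the back for elements to keep, and they swap.
// The to-be-removed values end up past the returned iterator, as with remove_if.
template <class BidiIt, class Pred>
BidiIt unstable_remove_if(BidiIt bad_iter, BidiIt good_iter, Pred pred)
{
    for (;;) {
        // advance to the next element the predicate wants removed
        while (bad_iter != good_iter && !pred(*bad_iter)) ++bad_iter;
        if (bad_iter == good_iter) return bad_iter; // new logical end
        // retreat to the last element the predicate wants kept
        do {
            --good_iter;
            if (bad_iter == good_iter) return bad_iter;
        } while (pred(*good_iter));
        std::swap(*bad_iter, *good_iter); // stability deliberately sacrificed
        ++bad_iter;
    }
}

It plugs into the usual erase-remove idiom: vsp.erase(unstable_remove_if(vsp.begin(), vsp.end(), pred), vsp.end());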
EDIT3: I played around and tried to find a worst case (a worst case for remove_if — notice how rarely the predicate would be true) and I got this:
#include <vector>
#include <string>
#include <iostream>
#include <algorithm>
#include <chrono>
#include <cstdlib> // rand
using namespace std;

int main()
{
    vector<string> vsp;
    int n;
    cin >> n;
    for (int i = 0; i < n; ++i)
    {
        string s = "123456";
        s.push_back('a' + (rand() % 26)); // random final character
        vsp.push_back(s);
    }
    auto vsp2 = vsp; // identical copy for the partition variant

    auto remove_start = std::chrono::high_resolution_clock::now();
    auto it = remove_if(begin(vsp), end(vsp),
                        [](const string& s) { return s < "123456b"; });
    vsp.erase(it, vsp.end());
    cout << vsp.size() << endl;
    auto remove_end = std::chrono::high_resolution_clock::now();
    cout << "erase-remove: "
         << chrono::duration_cast<std::chrono::milliseconds>(remove_end - remove_start).count()
         << " milliseconds\n";

    auto partition_start = std::chrono::high_resolution_clock::now();
    auto it2 = partition(begin(vsp2), end(vsp2),
                         [](const string& s) { return s >= "123456b"; });
    vsp2.erase(it2, vsp2.end());
    cout << vsp2.size() << endl;
    auto partition_end = std::chrono::high_resolution_clock::now();
    cout << "partition-remove: "
         << chrono::duration_cast<std::chrono::milliseconds>(partition_end - partition_start).count()
         << " milliseconds\n";
}
C:\STL\MinGW>g++ test_int.cpp -O2 && a.exe
12345678
11870995
erase-remove: 1426 milliseconds
11870995
partition-remove: 658 milliseconds
For other inputs, partition is a bit faster, the same, or slower. Color me puzzled. :D
I assume you're asking about a hypothetical definition of stable_remove to be what remove currently is, and remove to be implemented however the implementer thinks best to give the correct values in any order, with the expectation that implementers will be able to improve on just doing exactly the same as stable_remove.
In practice, the library can't easily make this optimization. It depends on the data, but you don't want to spend too long working out how many elements will be removed before deciding how to remove each one. For example, you could do an extra pass to count them, but there are plenty of cases where that extra pass is inefficient. Just because an unstable remove is faster than a stable one for certain cases doesn't necessarily mean that an adaptive algorithm choosing between the two is a good bet. A sketch of that counting idea follows.
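To make that concrete, here is what the adaptive idea might look like (an invented sketch, not anything a real library does; adaptive_remove_if is my name). The std::count_if pass is exactly the extra cost in question:

#include <algorithm> // std::count_if, std::remove_if
#include <iterator>  // std::distance

// Invented sketch: pay for an extra counting pass, then pick a strategy.
template <class FwdIt, class Pred>
FwdIt adaptive_remove_if(FwdIt first, FwdIt last, Pred pred)
{
    auto n = std::distance(first, last);
    auto k = std::count_if(first, last, pred); // the extra pass
    if (k == 0) return last;  // nothing to remove: no moves at all
    if (k == n) return first; // everything removed: no moves at all
    // Somewhere in between, a heuristic would choose between the stable
    // shift and an unstable back-fill; the stable remove stands in here.
    return std::remove_if(first, last, pred);
}

For a range where almost nothing is removed, that counting pass roughly doubles the reads for no benefit, which is the inefficiency referred to above.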
I think the difference between remove and sort is that sorting is known to be a complicated problem with a lot of different solutions, trade-offs and tweaks. All "simple" sort algorithms are slow on average. Most standard algorithms are pretty simple; remove is one of them, but sort is not. I don't think it makes a lot of sense, therefore, to define stable_remove and remove as separate standard functions.
Edit: your edit with my tweak (similar to std::partition but with no need to keep the values on the right) seems pretty reasonable to me. It requires a bidirectional iterator, but there is precedent in the standard for algorithms that behave differently on different iterator categories, such as std::distance. So it would be possible for the standard to define an unstable_remove that requires only a forward iterator, but does your thing if it gets a bidirectional iterator. The standard probably wouldn't lay out the algorithm, but it could have a phrase like "if the iterator is bidirectional, does at most min(k, n-k) moves, where k is the number of elements removed", which would in effect force it. But note that the standard doesn't currently say how many moves remove_if does, so I reckon pinning this down simply wasn't a priority.
There is of course nothing stopping you from implementing your own unstable_remove; a sketch follows.
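For example (all names here are mine, nothing standard), a user-side unstable_remove_if could dispatch on iterator category in the same spirit as std::distance, falling back to the stable std::remove_if when only forward iterators are available:

#include <algorithm> // std::remove_if
#include <iterator>  // std::iterator_traits, iterator tags
#include <utility>   // std::move

template <class FwdIt, class Pred>
FwdIt unstable_remove_if_impl(FwdIt first, FwdIt last, Pred pred,
                              std::forward_iterator_tag)
{
    return std::remove_if(first, last, pred); // nothing cleverer is known
}

template <class BidiIt, class Pred>
BidiIt unstable_remove_if_impl(BidiIt first, BidiIt last, Pred pred,
                               std::bidirectional_iterator_tag)
{
    // Two-ended scheme: at most min(k, n-k) moves, where k elements go.
    for (;;) {
        while (first != last && !pred(*first)) ++first; // next removable
        if (first == last) return first;
        do { // find the last element to keep
            --last;
            if (first == last) return first;
        } while (pred(*last));
        *first++ = std::move(*last); // fill the hole from the back
    }
}

template <class It, class Pred>
It unstable_remove_if(It first, It last, Pred pred)
{
    return unstable_remove_if_impl(
        first, last, pred,
        typename std::iterator_traits<It>::iterator_category{});
}

This version moves from the back instead of swapping, since remove-style algorithms leave the trailing elements in a valid but unspecified state anyway; random-access iterators are handled by the bidirectional overload because their tag derives from bidirectional_iterator_tag.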
If we accept that the standard didn't need to specify an unstable remove, the question then comes down to whether the function it does define should have been called stable_remove, anticipating a future remove that behaves differently for bidirectional iterators, and that might behave differently for forward iterators if some clever heuristic for doing an unstable remove ever becomes well enough known to be worth a standard function. I'd say not: it is not a disaster if the names of standard functions aren't completely regular, and it could have been pretty disruptive to remove the guarantee of stability from the STL's remove_if. Then the question becomes "why didn't the STL call it stable_remove_if?", to which I can only answer that, in addition to all the points made in all the answers, the STL design process was a sight quicker than the standardization process.
stable_remove would also open a can of worms regarding other standard functions that could in theory have unstable versions. For a particularly silly example, should copy be called stable_copy, just in case some implementation exists on which it's demonstrably faster to reverse the order of the elements while copying? Should copy be called copy_forward, so that the implementation can choose which of copy_backward and copy_forward is called by copy according to which is faster? Part of the committee's job is to draw a line somewhere.
I think realistically the current standard is sensible, and it would be sensible to separately define a stable_remove and a remove_with_some_other_constraints, but remove_in_some_unspecified_way just doesn't give the same opportunity for optimization that sort_in_some_unspecified_way does. Introsort was invented in 1997, just as C++ was being standardized, but I don't imagine the research effort around remove is quite what it was and is around sort. I may be wrong; optimizing remove might be the next big thing, and if so then the committee has missed a trick.
std::remove is specified to work with forward iterators.
The approach of working with a pair of iterators, one from the beginning and one from the end, would either increase the iterator requirements, and thus decrease the utility of the function, or violate or worsen its asymptotic complexity guarantees.
To answer my own question more than three years later :)
Yes, it was a "fail".
There is a proposal, D0041R0, that would add unstable_remove.
One could argue that the existence of a proposal to add std::unstable_remove does not mean that std::remove was a mistake, but I disagree. :)