vector::insert(dst_iterator, src_begin, src_end)
(insert a range) can be optimized for random-access iterators to reserve the required capacity src_end - src_begin
first, then perform the copy.
The main question I have: Does the Standard also allow vector::insert
to avoid a capacity check for each copied element? (I.e. not using push_back
or similar on every element to be inserted)
I'll refer to avoiding this capacity check as "optimization of insert
".
What could go wrong: I can imagine an iterator with side effects when dereferenced:
Note: the Standard guarantees the iterators passed to insert
will be dereferenced exactly once (see end of question).
#include <vector>
#include <iterator>
#include <iostream>
template < typename T >
struct evil_iterator : std::iterator < std::random_access_iterator_tag, T >
{
using base = std::iterator < std::random_access_iterator_tag, T >;
std::vector<T>* evil_feedback;
typename std::vector<T>::iterator innocent_iterator;
evil_iterator( std::vector<T>* c,
typename std::vector<T>::iterator i )
: evil_feedback{c}
, innocent_iterator{i}
{}
void do_evil()
{
std::cout << "trying to do evil; ";
std::cout << "cap: " << evil_feedback->capacity() << ", ";
std::cout << "size: " << evil_feedback->size() << ", ";
// better not invalidate the iterators of `*evil_feedback`
// passed to the `insert` call (see example below)
if( evil_feedback->capacity() > evil_feedback->size() )
{
evil_feedback->push_back( T{} );
// capacity() might be == size() now
std::cout << "successful >:]" << std::endl;
}else
{
std::cout << "failed >:[" << std::endl;
}
}
T& operator*()
{
do_evil(); // <----------------------------------------
return *innocent_iterator;
}
// non-evil iterator member functions-----------------------
evil_iterator& operator++()
{
++innocent_iterator;
return *this;
}
evil_iterator& operator++(int)
{
evil_iterator temp(*this);
++(*this);
return temp;
}
evil_iterator& operator+=(typename base::difference_type p)
{
innocent_iterator += p;
return *this;
}
evil_iterator& operator-=(typename base::difference_type p)
{
innocent_iterator -= p;
return *this;
}
evil_iterator& operator=(evil_iterator const& other)
{
evil_feedback = other.evil_feedback;
innocent_iterator = other.innocent_iterator;
return *this;
}
evil_iterator operator+(typename base::difference_type p)
{
evil_iterator temp(*this);
temp += p;
return temp;
}
evil_iterator operator-(typename base::difference_type p)
{
evil_iterator temp(*this);
temp -= p;
return temp;
}
typename base::difference_type operator-(evil_iterator const& p)
{
return this->innocent_iterator - p.innocent_iterator;
}
bool operator!=(evil_iterator const& other) const
{ return innocent_iterator != other.innocent_iterator; }
};
Example:
int main()
{
std::vector<int> src = {3, 4, 5, 6};
std::vector<int> dst = {1, 2};
evil_iterator<int> beg = {&dst, src.begin()};
evil_iterator<int> end = {&dst, src.end()};
// explicit call to reserve, see below
dst.reserve( dst.size() + src.size() );
// using dst.end()-1, which stays valid during `push_back`,
// thanks to Ben Voigt pointing this out
dst.insert(dst.end()-1, beg, end); // <--------------- doing evil?
std::copy(dst.begin(), dst.end(),
std::ostream_iterator<int>{std::cout, ", "});
}
Questions:
- Can
vector::insert
be optimized to avoid a capacity check for each inserted element? - Is
evil_iterator
still a valid iterator? - If so, is
evil_iterator
evil, i.e. can it result in UB / non-complying behaviour ifinsert
is optimized as described above?
Maybe my do_evil
is not evil enough.. have no problems on clang++ 3.2 (using libstdc++):
Edit 2: Added the call to reserve
. Now, I'm doing evil :)
trying to do evil; cap: 6, size: 2, successful >:]
trying to do evil; cap: 6, size: 3, successful >:]
trying to do evil; cap: 6, size: 4, successful >:]
trying to do evil; cap: 6, size: 9, failed >:[
1, 3, 4, 5, 6, 0, 0, 135097, 2,
Edit: Why I think the optimization could break this:
- Consider
dst.size() == dst.capacity() == 2
at the beginning. - The call to
insert
requires a new capacity of 6. - The optimization enlarges the capacity to exactly 6, then begins to insert the elements by copying from the
src
iterators (beg
,end
). - This copying is done within a loop where no capacity checks occur. (That is the optimization.)
- During the process of copying, further elements are added to the vector (w/o invalidating the iterators), in
do_evil
. The capacity now is not sufficient any more to hold the rest of the elements to be copied.
Maybe you had to use reserve
in the example explicitly to force updating the observable capacity
before using do_evil
. Currently, insert
could reserve some capacity but change what capacity
returns (i.e. observable capacity) only after the copying is done.
What I've found in the Standard so far seems to allow the optimization of insert
:
[sequence.reqmts]/3
a.insert(p,i,j)
[...]Requires: T shall be EmplaceConstructible into X from *i.
For vector, if the iterator does not meet the forward iterator requirements (24.2.5), T shall also be MoveInsertable into X and MoveAssignable. Each iterator in the range [i,j) shall be dereferenced exactly once.
pre: i and j are not iterators into a. Inserts copies of elements in [i, j) before p
[vector.modifiers] on insert
1 Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid. If an exception is thrown other than by the copy constructor, move constructor, assignment operator, or move assignment operator of T or by any InputIterator operation there are no effects. If an exception is thrown by the move constructor of a non-CopyInsertable T, the effects are unspecified.
2 Complexity: The complexity is linear in the number of elements inserted plus the distance to the end of the vector.