Modifying a data structure while iterating over it

2020-02-06 05:39发布

问题:

What happens when you add elements to a data structure such as a vector while iterating over it. Can I not do this?

I tried this and it breaks:

int main() {
    vector<int> x = { 1, 2, 3 };

    int j = 0;
    for (auto it = x.begin(); it != x.end(); ++it) {
        x.push_back(j);
        j++;
        cout << j << " .. ";
    }
}

回答1:

Iterators are invalidated by some operations that modify a std::vector.

Other containers have various rules about when iterators are and are not invalidated. This is a post (by yours truly) with details.

By the way, the entrypoint function main() MUST return int:

int main() { ... }


回答2:

What happens when you add elements to a data structure such as a vector while iterating over it. Can I not to this?

The iterator would become invalid IF the vector resizes itself. So you're safe as long as the vector doesn't resize itself.

I would suggest you to avoid this.

The short explanation why resizing invalidates iterator:

Initially the vector has some capacity (which you can know by calling vector::capacity().), and you add elements to it, and when it becomes full, it allocates larger size of memory, copying the elements from the old memory to the newly allocated memory, and then deletes the old memory, and the problem is that iterator still points to the old memory, which has been deallocated. That is how resizing invalidates iterator.

Here is simple demonstration. Just see when the capacity changes:

std::vector<int> v;
for(int i = 0 ; i < 100 ; i++ )
{
  std::cout <<"size = "<<v.size()<<", capacity = "<<v.capacity()<<std::endl;
  v.push_back(i);
}       

Partial Output:

size = 0, capacity = 0
size = 1, capacity = 1
size = 2, capacity = 2
size = 3, capacity = 4
size = 4, capacity = 4
size = 5, capacity = 8
size = 6, capacity = 8
size = 7, capacity = 8
size = 8, capacity = 8
size = 9, capacity = 16
size = 10, capacity = 16

See the complete output here : http://ideone.com/rQfWe

Note: capacity() tells the maximum number of elements the vector can contain without allocating new memory, and size() tells the number of elements the vector currently containing.



回答3:

It's a bad idea in general, because if the vector is resized, the iterator will become invalid (it's wrapping a pointer into the vector's memory).

It's also not clear what your code is really trying to do. If the iterator somehow didn't become invalid (suppose it was implemented as an index), I'd expect you to have an infinite loop there - the end would never be reached because you're always adding elements.

Assuming you want to loop over the original elements, and add one for each, one solution would be to add the new elements to a second vector, and then concatenate that at the end:

vector<int> temp;

// ...

// Inside loop, do this:
temp.push_back(j);

// ...

// After loop, do this to insert all new elements onto end of x
x.insert(x.end(), temp.begin(), temp.end());


回答4:

It's not a good idea to do it.

You could think about the case where your vector would need to be resized after a push_back. It would then need to be moved to a bigger memory spot and your iterators would now be invalid.



回答5:

While you used vector as an example, there are other stl containers which are able to have elements pushed-back without invalidating iterators. Pushing back an element into a std::list doesn't require any re-allocation of existing elements as they aren't stored contiguously (lists instead comprise of nodes linked together by pointers to the next node), therefore iterators remain valid as the node they internally point to still resides at the same address.



回答6:

if you need to do it this way, you can reserve the maximum number of records you could add. this will stop the vector from needing to resize, and this should prevent crashes