Keeping std::list iterators valid through insertio

2019-01-27 23:31发布

问题:

Note: This is not a question whether I should "use list or deque". It's a question about the validity of iterators in the face of insert().


This may be a simple question and I'm just too dense to see the right way to do this. I'm implementing (for better or worse) a network traffic buffer as a std::list<char> buf, and I'm maintaining my current read position as an iterator readpos.

When I add data, I do something like

buf.insert(buf.end(), newdata.begin(), newdata.end());

My question is now, how do I keep the readpos iterator valid? If it points to the middle of the old buf, then it should be fine (by the iterator guarantees for std::list), but typically I may have read and processed all data and I have readpos == buf.end(). After the insertion, I want readpos always to point to the next unread character, which in case of the insertion should be the first inserted one.

Any suggestions? (Short of changing the buffer to a std::deque<char>, which appears to be much better suited to the task, as suggested below.)

Update: From a quick test with GCC4.4 I observe that deque and list behave differently with respect to readpos = buf.end(): After inserting at the end, readpos is broken in a list, but points to the next element in a deque. Is this a standard guarantee?

(According to cplusplus, any deque::insert() invalidated all iterators. That's no good. Maybe using a counter is better than an iterator to track a position in a deque?)

回答1:

if (readpos == buf.begin())
{
    buf.insert(buf.end(), newdata.begin(), newdata.end());
    readpos = buf.begin();
}
else
{
    --readpos;
    buf.insert(buf.end(), newdata.begin(), newdata.end());
    ++readpos;
}

Not elegant, but it should work.



回答2:

From http://www.sgi.com/tech/stl/List.html

"Lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed."

Therefore, readpos should still be valid after the insert.

However...

std::list< char > is a very inefficient way to solve this problem. Each byte you store in a std::list requires a pointer to keep track of the byte, plus the size of the list node structure, two more pointers usually. That is at least 12 or 24 bytes (32 or 64-bit) of memory used to keep track of a single byte of data.

std::deque< char> is probably a better container for this. Like std::vector it provides constant time insertions at the back however it also provides constant time removal at the front. Finally, like std::vector std::deque is a random-access container so you can use offsets/indexes instead of iterators. These three features make it an efficient choice.



回答3:

I was indeed being dense. The standard gives us all the tools we need. Specifically, the sequence container requirements 23.2.3/9 say:

The iterator returned from a.insert(p, i, j) points to the copy of the first element inserted into a, or p if i == j.

Next, the description of list::insert says (23.3.5.4/1):

Does not affect the validity of iterators and references.

So in fact if pos is my current iterator inside the list which is being consumed, I can say:

auto it = buf.insert(buf.end(), newdata.begin(), newdata.end());

if (pos == buf.end()) { pos = it; }

The range of new elements in my list is [it, buf.end()), and the range of yet unprocessed elements is [pos, buf.end()). This works because if pos was equal to buf.end() before the insertion, then it still is after the insertion, since insertion does not invalidate any iterators, not even the end.



回答4:

list<char> is a very inefficient way to store a string. It is probably 10-20 times larger than the string itself, plus you are chasing a pointer for every character...

Have you considered using std::dequeue<char> instead?

[edit]

To answer your actual question, adding and removing elements does not invalidate iterators in a list... But end() is still going to be end(). So you would need to check for that as a special case at the point where you insert the new element in order to update your readpos iterator.