Python's itertools implement a chain iterator which essentially concatenates a number of different iterators to provide everything from single iterator.
Is there something similar in C++ ? A quick look at the boost libraries didn't reveal something similar, which is quite surprising to me. Is it difficult to implement this functionality?
Came across this question while investigating for a similar problem.
Even if the question is old, now in the time of C++ 11 and boost 1.54 it is pretty easy to do using the Boost.Range library. It features a join
-function, which can join two ranges into a single one. Here you might incur performance penalties, as the lowest common range concept (i.e. Single Pass Range or Forward Range etc.) is used as new range's category and during the iteration the iterator might be checked if it needs to jump over to the new range, but your code can be easily written like:
#include <boost/range/join.hpp>
#include <iostream>
#include <vector>
#include <deque>
int main()
{
std::deque<int> deq = {0,1,2,3,4};
std::vector<int> vec = {5,6,7,8,9};
for(auto i : boost::join(deq,vec))
std::cout << "i is: " << i << std::endl;
return 0;
}
In C++, an iterator usually doesn't makes sense outside of a context of the begin and end of a range. The iterator itself doesn't know where the start and the end are. So in order to do something like this, you instead need to chain together ranges of iterators - range is a (start, end) pair of iterators.
Takes a look at the boost::range documentation. It may provide tools for constructing a chain of ranges. The one difference is that they will have to be the same type and return the same type of iterator. It may further be possible to make this further generic to chain together different types of ranges with something like any_iterator, but maybe not.
I've written one before (actually, just to chain two pairs of iterators together). It's not that hard, especially if you use boost's iterator_facade
.
Making an input iterator (which is effectively what Python's chain
does) is an easy first step. Finding the correct category for an iterator chaining a combination of different iterator categories is left as an exercise for the reader ;-).
What you are essentially looking for is a facade iterator that abstracts away the traversing through several sequences.
Since you are coming from a python background I'll assume that you care more about flexibility rather than speed. By flexibility I mean the ability to chain-iterate through different sequence types together (vector, array, linked list, set etc....) and by speed I mean only allocating memory from the stack.
If this is the case then you may want to look at the any_iterator from adobe labs:
http://stlab.adobe.com/classadobe_1_1any__iterator.html
This iterator will give you the ability to iterate through any sequence type at runtime. To chain you would have a vector (or array) of 3-tuple any_iterators, that is, three any_iterators for each range you chain together (you need three to iterate forward or backward, if you just want to iterate forward two will suffice).
Let's say that you wanted to chain-iterate through a sequence of integers:
(Untested psuedo-c++ code)
typedef adobe::any_iterator AnyIntIter;
struct AnyRange {
AnyIntIter begin;
AnyIntIter curr;
AnyIntIter end;
};
You could define a range such as:
int int_array[] = {1, 2, 3, 4};
AnyRange sequence_0 = {int_array, int_array, int_array + ARRAYSIZE(int_array)};
Your RangeIterator class would then have an std::vector.
<code>
class RangeIterator {
public:
RangeIterator() : curr_range_index(0) {}
template <typename Container>
void AddAnyRange(Container& c) {
AnyRange any_range = { c.begin(), c.begin(), c.end() };
ranges.push_back(any_range);
}
// Here's what the operator++() looks like, everything else omitted.
int operator++() {
while (true) {
if (curr_range_index > ranges.size()) {
assert(false, "iterated too far");
return 0;
}
AnyRange* any_range = ranges[curr_range_index];
if (curr_range->curr != curr_range->end()) {
++(curr_range->curr);
return *(curr_range->curr);
}
++curr_range_index;
}
}
private:
std::vector<AnyRange> ranges;
int curr_range_index;
};
</code>
I do want to note however that this solution is very slow. The better, more C++ like approach is just to store all the pointers to the objects that you want operate on and iterate through that. Alternatively, you can apply a functor or a visitor to your ranges.
Not in the standard library. Boost might have something.
But really, such a thing should be trivial to implement. Just make yourself an iterator with a vector of iterators as a member. Some very simple code for operator++, and you're there.
Check Views Template Library (VTL). It may not provided 'chained iterator' directly. But I think it has all the necessary tools/templates available for implementing your own 'chained iterator'.
From the VTL Page:
A view is a container adaptor, that provides a container interface to
- parts of the data or
- a rearrangement of the data or
- transformed data or
- a suitable combination of the data sets
of the underlying container(s). Since views themselves provide the container interface, they can be easily combined and stacked. Because of template trickery, views can adapt their interface to the underlying container(s). More sophisticated template trickery makes this powerful feature easy to use.
Compared with smart iterators, views are just smart iterator factories.
No functionality exists in boost that implements this, to the best of my knowledge - I did a pretty extensive search.
I thought I'd implement this easily last week, but I ran into a snag: the STL that comes with Visual Studio 2008, when range checking is on, doesn't allow comparing iterators from different containers (i.e., you can't compare somevec1.end() with somevec2.end() ). All of a sudden it became much harder to implement this and I haven't quite decided yet on how to do it.
I wrote other iterators in the past using iterator_facade and iterator_adapter from boost, which are better than writing 'raw' iterators but I still find writing custom iterators in C++ rather messy.
If someone can post some pseudocode on how this could be done /without/ comparing iterators from different containers, I'd be much obliged.