Iterators for circular structures

The following code shows what I currently have. It is an adapter for circular data-structures. The main function shows how it is used. This is all nice and fast but I really would like to have iterators over the structure defined by circ. All approaches up till now involved either some counting scheme (count the range if with the circulator, build an iterator that counts increments and decrements) or boolean values to check if the iterator has been moved to avoid begin and end being equal.

Are there some common solutions to adapt a circular structure to iterators? What other work-arounds are possible?

I would like to preserve the iteration speed as much as possible but am willing to make compromises here. I would value a fully conforming iterator over a minor speed penalty.

#include <cstddef> // nullptr
#include <iostream>
#include <boost/noncopyable.hpp>
#include <boost/operators.hpp>

// circular structure that we want to iterate over
struct circ : private boost::noncopyable {
  unsigned int i;
  circ* next;
  circ* prev;
};

// whacked up circulator, imagine some template cream here to make it
// generic, omitted to preserve sanity
struct circulator
  : public boost::incrementable<circulator>
  , public boost::decrementable<circulator>
  , public boost::equality_comparable<circulator, circulator>
  , public boost::dereferenceable<circulator, circ*>
{
  circulator()
    : c(nullptr) {}
  circulator(circ& c) : c(&c) {}

  bool operator==(const circulator& other) const {
    return this->c == other.c;
  }

  circulator& operator++() { c = c->next; return *this; }
  circulator& operator--() { c = c->prev; return *this; }

  explicit operator bool() const { return c; }

  circ& operator*() const { return *c; }

  circ* c;
};

int main()
{
  circ a, b, c, d;
  a.next = &b; a.prev = &a; a.i = 0;
  b.next = &c; b.prev = &a; b.i = 1;
  c.next = &d; c.prev = &b; c.i = 2;
  d.next = &a; d.prev = &c; d.i = 3;
  circulator begin{a}, end{a};
  if(begin)
    do {
      std::cout << begin->i << std::endl;
      begin = ++begin;
    } while(begin != end);

  return 0;
}

If need be I can add some of my previous approaches, but they are rather verbose and would add unnecessary bloat to the question.

EDIT: It would be nice if the resulting iterator would be bidirectional. Although I can give up this requirement.

回答1:

If it were me, I'd have operator++ notice the terminal condition, and set c to some sentinal value:

circulator(circ& c) : c(&c), start(&c) {}
circulator& operator++() { c = c->next; if(c==start) c=nullptr; return *this; }

Usage:

circulator begin{a}, end;
while(begin != end) {
  begin++;
}

Note that this usage defines the end iterator as holding a nullptr, which means that you can't do this:

circulator end;
--end;

回答2:

Usually, "circulator" means a circular iteration adapter for a linear structure. What you desire is really an inverse adapter: it takes a circular structure, and presents a linear iterator that has a beginning and an end - such concepts simply don't apply to circular iterators or circular structure.

So, to avoid confusion, I'll refer to the iterator as a circ_iterator. The true circulator for your circular structure is trivial, and must not care about any ends or beginnings.

The desired functionality can be had by tagging the iterator:

The idiomatic way of getting start/end iterators for type T is via begin and end in the namespace where T lives, or via member functions of the same name. The instantiation circ_iterator end{a} would be non-idiomatic. Instead, overload begin and end on circ&. Both return an iterator pointing to the argument. begin tags the iterator Default, and end tags the iterator End. See this question for details.
Only the end iterator is special, and all of the typical iterator semantics can be had by adding a small, three-valued tag to the iterator. Its values are:
- End: the iterator originated as a result of end;
- Inc: the iterator did not originate from end and has been most recently incremented;
- Default: otherwise.
An iterator obtained from end will retain its End tag forever. Otherwise, an iterator starts with the Default tag, and switches to Inc on increment, and back to Default on decrement.

Note that begin and end can never be the same, since there's no way for the circular container to have a zero size: a circ item always holds at least one data item. You could, of course represent an absence of a circ instance using a null iterator that compares equal to any other null iterator.

The increment operation is special, since the only legal way to approach the end iterator is by incrementing. When doing so, the following must hold:

You're not incrementing starting at end, since that's illegal.
Only after an increment you might be before the end, or at the end.

Thus, the iterators are identical when their pointers are identical, and:

the other iterator is not tagged End, or
this iterator is not tagged Default (it must be itself End or Inc - recently incremented).

Since the tag is small (2 bits wide), you can assume, or statically assert, that the circ type is aligned to 4 bytes and the platform-specific uintptr_t<->*circ conversions are "sane", and using the tagged pointer trick to retain the tag in the least significant bits of the pointer. I provide both the version that employs the tagged pointer trick, and one that doesn't.

Finally, it's much easier to implement iterators by deriving from boost::iterator_facade. I'm leaving the implementation of const_circ_iterator as an exercise to the reader. It is well documented.

The code compiles on MSVC2012 and also on LLVM 6.

First, let's deal with the tagged pointers - this is a very basic implementation, but will do for our purposes.

// https://github.com/KubaO/stackoverflown/tree/master/questions/circ-

iterator-9993713
#include <boost/iterator/iterator_facade.hpp>
#include <boost/noncopyable.hpp>
#include <boost/operators.hpp>
#include <limits>
#include <iostream>
#include <cassert>
#include <cstdint>
#include <algorithm>

template <typename T, bool merge_tag = false, typename tag_type = uint8_t> class tagged_ptr;

template <typename T, typename tag_type> class tagged_ptr<T, true, tag_type> {
    uintptr_t ptr;
    typedef std::numeric_limits<uintptr_t> lim;
    inline static uintptr_t ptr_of(T* p) {
        assert(tag_of(p) == 0);
        return uintptr_t(p);
    }
    inline static uintptr_t tag_mask() { return 3; }
    inline uintptr_t ptr_only() const { return ptr & (lim::max() - tag_mask()); }
    inline static tag_type tag_of(T* p) { return ((tag_type)(uintptr_t)p) & tag_mask(); }
    inline tag_type tag_only() const { return ptr & tag_mask(); }
public:
    tagged_ptr(T* p, tag_type t) : ptr(ptr_of(p) | t) { assert(t <= tag_mask()); }
    tagged_ptr(const tagged_ptr & other) : ptr(other.ptr) {}
    operator T*() const { return reinterpret_cast<T*>(ptr_only()); }
    T* operator->() const { return reinterpret_cast<T*>(ptr_only()); }
    tagged_ptr & operator=(T* p) { ptr = tag_only() | ptr_of(p); return *this; }
    tag_type tag() const { return tag_only(); }
    void set_tag(tag_type tag) { assert(tag <= tag_mask()); ptr = tag | ptr_only(); }
};

template <typename T, typename tag_type> class tagged_ptr<T, false, tag_type> {
    T* ptr;
    tag_type m_tag;
public:
    tagged_ptr(T* p, tag_type t) : ptr(p), m_tag(t) {}
    tagged_ptr(const tagged_ptr & other) : ptr(other.ptr), m_tag(other.m_tag) {}
    operator T*() const { return ptr; }
    T* operator->() const { return ptr; }
    tagged_ptr & operator=(T* p) { ptr = p; return *this; }
    tag_type tag() const { return m_tag; }
    void set_tag(tag_type tag) { m_tag = tag; }
};

The circ class can have some convenience constructors to make constructing circular lists easier, and to avoid the mistake you made in your question (a.prev = &a is wrong).

struct circ : private boost::noncopyable {
    unsigned int i;
    circ* next;
    circ* prev;
    explicit circ(int i) : i(i), next(nullptr), prev(nullptr) {}
    circ(int i, circ& prev) : i(i), next(nullptr), prev(&prev) {
        prev.next = this;
    }
    circ(int i, circ& prev, circ& next) : i(i), next(&next), prev(&prev) {
        prev.next = this;
        next.prev = this;
    }
};

The circ_iterator is then:

class circ_iterator;
circ_iterator end(circ& c);

class circ_iterator
        : public boost::iterator_facade<
        circ_iterator, circ, boost::bidirectional_traversal_tag
        >
{
    tagged_ptr<circ> c;
    enum { Default, Inc, End };
    friend class boost::iterator_core_access;
    friend circ_iterator end(circ&);
    struct end {};
    circ_iterator(circ& c_, end) : c(&c_, End) {}

    circ& dereference() const { return *c; }
    void increment() {
        c = c->next;
        if (c.tag() != End) c.set_tag(Inc);
    }
    void decrement() {
        c = c->prev;
        if (c.tag() != End) c.set_tag(Default);
    }
    bool equal(const circ_iterator & other) const {
        return this->c == other.c &&
                (other.c.tag() != End || this->c.tag() != Default);
    }
public:
    circ_iterator() : c(nullptr, Default) {}
    circ_iterator(circ& c_) : c(&c_, Default) {}
    circ_iterator(const circ_iterator& other) : c(other.c) {}
};

circ_iterator begin(circ& c) { return circ_iterator(c); }
circ_iterator end(circ& c) { return circ_iterator(c, circ_iterator::end()); }

Finally, a simple demonstration:

int main()
{
    circ a(0), b(1, a), c(2, b), d(3, c, a);
    assert(end(a) == end(a));
    assert(++--end(a) == end(a));
    for (auto it = begin(a); it != end(a); ++it) {
        std::cout << it->i << std::endl;
    }
    std::cout << "*" << std::endl;
    for (auto it = ++begin(a); it != --end(a); ++it) {
        std::cout << it->i << std::endl;
    }
    std::cout << "*" << std::endl;
    for (auto & c : a)
        std::cout << c.i << std::endl;
}

Output: