Wrapping linked lists in iterators

2020-04-08 14:25发布

问题:

A set of APIs that I commonly use follow a linked-list pattern:

struct SomeObject
{
    const char* some_value;
    const char* some_other_value;
    SomeObject* next;    
}

LONG GetObjectList( SomeObject** list );
void FreeObjectList( SomeObject* list );

This API is not mine and I cannot change it.

So, I'd like to encapsulate their construction/destruction, access, and add iterator support. My plan is to do something like this:

/// encapsulate access to the SomeObject* type
class MyObject
{
public:
    MyObject() : object_( NULL ) { };
    MyObject( const SomeObject* object ) : object_( object ) { };
    const char* SomeValue() const 
    { 
        return NULL != object_ ? object_->some_value : NULL; 
    };
    const char* SomeValue() const 
    { 
        return NULL != object_ ? object_->some_other_value : NULL; 
    };

private:
    SomeObject* object_;
}; // class MyObject

bool operator==( const MyObject& i, const MyObject& j )
{
    return // some comparison algorithm.
};

/// provide iterator support to SomeObject*
class MyObjectIterator 
    : public boost::iterator_adaptor< MyObjectIterator, 
                                      MyObject*,
                                      boost::use_default,
                                      boost::forward_traversal_tag >
{
public:
    // MyObjectIterator() constructors

private:
    friend class boost::iterator_core_access;

    // How do I cleanly give the iterator access to the underlying SomeObject*
    // to access the `next` pointer without exposing that implementation detail
    // in `MyObject`?
    void increment() { ??? }; 
};

/// encapsulate the SomeObject* creation/destruction
class MyObjectList
{
public:
    typedef MyObjectIterator const_iterator;

    MyObjectList() : my_list_( MyObjectList::Create(), &::FreeObjectList )
    {
    };

    const_iterator begin() const
    {
        // How do I convert a `SomeObject` pointer to a `MyObject` reference?
        return MyObjectIterator( ??? );
    };

    const_iterator end() const
    {
        return MyObjectIterator();
    };

private:
    static SomeObject* Create()
    {
        SomeObject* list = NULL;
        GetObjectList( &list );
        return list;
    };

    boost::shared_ptr< void > my_list_;
}; // class MyObjectList

My two questions are:

  1. How do I cleanly give MyObjectIterator access to the underlying SomeObject to access the next pointer in the linked list without exposing that implementation detail in MyObject?

  2. In MyObjectList::begin(), how do I convert a SomeObject pointer to a MyObject reference?

Thanks, PaulH


Edit: The linked-list APIs I'm wrapping are not mine. I cannot change them.

回答1:

  1. You would make MyObjectIterator a friend of MyObject. I don't see any better way. And really I think it's reasonable that iterators get whatever special friend access is necessary for them to perform their duties.

  2. You don't seem to have considered how and where your MyObject instance are going to be stored. Or perhaps that's what this question is coming out of. It seems like you would have to have a separate linked list of MyObjects in your MyObjectList. Then at least MyObjectList::begin() can just grab the first MyObject of your internal linked list of them. And if the only operations that may modify or rearrange this list only ever happen through the iterators you provide, then you can problem keep these lists in synch without too much trouble. Otherwise, if there are functions in the API you're using that take the raw SomeObject linked list and manipulate it, then you may have trouble.

I can see why you've tried to design this scheme, but having separate MyObjects that point to SomeObjects even through SomeObjects themselves make up the real list structure.... That is not an easy way to wrap a list.

The simplest alternative is just to do away with MyObject completely. Let your iterators work against SomeObject instances directly and return those when dereferenced. Sure that does expose SomeObject to the outside, particularly its next member. But is that really a big enough problem to justify a more complex scheme to wrap it all up?

Another way to deal with might be to have MyObject privately inherit from SomeObject. Then each SomeObject instance can be downcast as a reference to a MyObject instance and given to the outside world that way, thus hiding the implemtation details of SomeObject and only exposing the desired public member functions. The standard probably doesn't guarantee this will work, but in practice since they'll likely have the exact same memory layouts you may be able to get away with it. However, I'm not sure that I would actually ever try such a thing in a real program unless absolutely necessary.

The last alternative I can think of is just to transfer the data into a list of more convenient data structures after being given to you by this API. And then of course transfer it back into a raw SomeObject list only if necessary to pass it back to the API. Just make your own SomeObjectData or whatever to store the strings and put them in a std::list. Whether or not this is actually feasible for you really depends on how this data is used in the context of the API you've mentioned. If there are other API functions that modify SomeObject lists and you need to use them frequently, then constantly converting SomeObject lists to and from std::list<SomeObjectData> could be annoying.



回答2:

First, of course, for real use, you almost certainly shouldn't be writing your own linked list or iterator at all. Second, good uses for linked lists (even one that's already written, debugged, etc.) are also pretty rare -- except in a few rather unusual circumstances, you should probably use something else (most often vector).

That said, an iterator is typically a friend (or nested class) of the class to which it provides access. It provides the rest of the world with an abstract interface, but the iterator itself has direct knowledge of (and access to) the internals of the linked list (or whatever container) to which it provides access). Here's a general notion:

// warning: This is really pseudo code -- it hasn't been tested, and would 
// undoubtedly require a complete rewrite to even compile, not to mention work.
template <class T>
class linked_list { 

public:
    class iterator;

private:
    // A linked list is composed of nodes.
    // Each node has a value and a pointer to the next node:    
    class node { 
        T value;
        node *next;
        friend class iterator;
        friend class linked_list;
    public:
        node(T v, node *n=NULL) : value(v), next(n) {}
    };

public:

    // An iterator gives access to the linked list.
    // Operations: 
    //      increment: advance to next item in list
    //      dereference: access value at current position in list
    //      compare: see if one iterator equals another    
    class iterator { 
        node *pos;
    public:
        iterator(node *p=NULL) : pos(p) {}
        iterator operator++() { 
            assert(pos); 
            pos = pos->next; 
            return *this;
        }
        T operator*() { return pos->value; }
        bool operator!=(iterator other) { return pos != other.pos; }
    };

    iterator begin() { return iterator(head); }
    iterator end()   { return iterator(); }

    void push_front(T value) { 
        node *temp = new node(value, head); 
        head = temp;
    }

    linked_list() : head(NULL) {}

private:
    node *head;
};

To work along with the algorithms in the standard library, you have to define quite a bit more than this tried to (e.g., typedefs like value_type and reference_type). This is only intended to show the general structure.



回答3:

My advice: Trash this and use an existing slist<> implementation. IIRC, it will be in C++1x, so your compiler(s) might already support it. Or it might be in boost. Or take it from someplace else.

Wherever you get it from, what you get has all these problems already solved, is likely very well tested, therefore bug-free and fast, and the code using it is easily recognizable (looking at it many of us would immediately see what it does, because it's been around for a while and it's going to to be part of the next standard).

The last time I wrote my own list class was before the STL became part of the C++ standard library.

Ok, since you're stuck with the API you have, here's something that might get you started:

class MyObjectList
{
public:
    typedef SomeObject value_type;
    // more typedefs

    class iterator {
    public:
        typedef SomeObject value_type;
        // more typedefs

        iterator(SomeObject* pObj = NULL)
                                    : pObj_(pObj) {}
        iterator& operator++()        {if(pObj_)pObj_ = pObj_->next;}
        iterator  operator++(int)     {iterator tmp(*this);
                                      operator++();
                                      return tmp;}
        bool operator==(const iterator& rhs) const
                                      {return pObj_ == rhs.pObj_;}
        bool operator!=(const iterator& rhs) const
                                      {return !operator==(rhs);}
              value_type& operator*() {return pObj_;}
    private:
        SomeObject* pObj_;
    };

    class const_iterator {
    public:
        typedef SomeObject value_type;
        // more typedefs
        const_iterator(const SomeObject* pObj = NULL)
                                    : pObj_(pObj) {}
        iterator& operator++()        {if(pObj_)pObj_ = pObj_->next;}
        iterator  operator++(int)     {iterator tmp(*this);
                                      operator++();
                                      return tmp;}
        bool operator==(const iterator& rhs) const
                                      {return pObj_ == rhs.pObj_;}
        bool operator!=(const iterator& rhs) const
                                      {return !operator==(rhs);}
        const value_type& operator*() {return pObj_;}
    private:
        const SomeObject* pObj_;
    };

    MyObjectList()                   : list_() {GetObjectList(&list_;);}
    ~MyObjectList()                    {FreeObjectList(list_);}
          iterator begin()             {return list_ ?       iterator(list_)
                                                     :       iterator();}
    const_iterator begin() const       {return list_ ? const_iterator(list_)
                                                     : const_iterator();}
          iterator end  ()             {return       iterator(getEnd_());}
    const_iterator end  () const       {return const_iterator(getEnd_());}

private:
    SomeObject* list_;

    SomeObject* getEnd_()
    {
        SomeObject* end = list_;
        if(list_)
            while(end->next)
                end = end->next;
        return end;
    }
};

Obviously, there's more to this (for example, I believe const and non-const iterators should be comparable, too), but that shouldn't be hard to add to this.



回答4:

From what you said, you probably have a BIG legacy code using your struct SomeObject types but you want to play nicely with new code and use iterators/stl containers.

If that's the case, you will not be able to (in an easy way) use your new created iterator in all the legacy code base, since that will be changing a lot of code, but, you can write a templated iterator that, if your structs follow the same pattern, having a next field, will work.

Something like this (I haven't tested nor compiled it, it's just an idea):

Suppose you have your struct:

struct SomeObject
{
    SomeObject* next;
}

You will be able to create something like this:

template <class T>
class MyIterator {
public:
//implement the iterator abusing the fact that T will have a `next` field, and it is accessible, since it's a struct
};

template <class T>
MyIterator<T> createIterator(T* object) {
   MyIterator<T> it(object);
   return it;
}

If you implement your iterator correctly, you will be able to use all the STL algorithms with your old structs.

PS.: If you're in a scenario of some legacy code with this kind of structs, I do too, and I implemented this workaround. It works great.



回答5:

I've seen some very good answers so far but I fear they might be "bare".

Before blindingly charge in let's examine the requirements.

  • As you noticed the structure is similar to a singly linked list, the C++0x standard defines such a structure, called forward_list. Read up its interface, yours should come close.
  • The iterator type will be ForwardIterator.

I would like to expose one very annoying fact though: who's responsible for the memory ?

I am worried because there's no copy facility in the interface you've provided, so should we disable the copy constructor and assignment operator of our new list class ?

Implementation is easy, there's enough on this page even though the iterator don't properly implement the iterator traits in general, but I would consider ditching this idea completely and move on to a better scheme:

  • class MyObject { public: ... private: SomeObject mData; };
  • Wrapping up GetObjectList and returning a deque<MyObject>, I guess the LONG returns the number of items ?