How to use boost::python::iterator with return_int

Posted 2019-05-13 19:31

Question:

I have a class Type which cannot be copied and has no default constructor. I have a second class A that acts as a set of Type objects. This second class gives access via iterators, and my iterator has a dereference operator:

class A {
    class iterator {
        [...]
      public:
        Type & operator*()
        { 
            return instance;
        }
      private:
        Type instance;
    };
    [...]
};

To expose this to Python, I wrote boost::python code that looks like this:

class_<A>("A", [...])
    .def("__iter__", iterator<A, return_internal_reference<> >())
    .def("__len__", container_length_no_diff<A, A::iterator>)
;

After adding print messages to all iterator operations (construction, assignment, dereference, destruction), for Python code like this:

for o in AInstance:
    print o.key

I get this output (trimmed to the important part):

construct 0xffffffff7fffd3e8
dereference: 0xffffffff7fffd3e8
destroy 0xffffffff7fffd3e8
get key 0xffffffff7fffd3e8

In the output above, the addresses are those of the instance member (or of this in a method call). The first three lines are printed by the iterator; the fourth line is printed by a getter method in Type. So somehow boost::python wraps everything in such a manner that it:

  1. creates iterator
  2. dereferences iterator and stores reference
  3. destroys iterator (and object it contains)
  4. uses reference obtained in step two

So clearly return_internal_reference does not behave as stated (note that it is actually just a thin layer over with_custodian_and_ward_postcall<>): it should keep the owning object alive for as long as the result of the method call is referenced.

So my question is how do I expose such an iterator to Python with boost::python?

edit:

As was pointed out, this might not be clear: the original container does not contain objects of type Type. It contains some BaseType objects from which I am able to construct/modify a Type object. So the iterator in the above example acts like a transform_iterator.
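
The sequence of events reported in the question can be replayed in plain C++, without Boost.Python, which may make the lifetime issue easier to see. This is only a sketch: Item, OwningIterator and replay_events are hypothetical names standing in for Type, A::iterator and the work done by the Python wrapper, and a shared log replaces the print messages.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Type: non-copyable, logs its lifetime.
struct Item {
    explicit Item(std::vector<std::string>& log) : log_(log) { log_.push_back("construct"); }
    ~Item() { log_.push_back("destroy"); }
    Item(const Item&) = delete;
    Item& operator=(const Item&) = delete;
    std::vector<std::string>& log_;
};

// Iterator that owns its element by value, as in the question.
struct OwningIterator {
    explicit OwningIterator(std::vector<std::string>& log) : log_(log), instance_(log) {}
    Item& operator*() { log_.push_back("dereference"); return instance_; }
    std::vector<std::string>& log_;  // declared first, so it is initialized first
    Item instance_;
};

// Replays steps 1-4: create, dereference, destroy, then "use" the stale reference.
std::vector<std::string> replay_events() {
    std::vector<std::string> log;
    Item* borrowed = nullptr;
    {
        OwningIterator it(log);  // step 1: construct
        borrowed = &*it;         // step 2: dereference and keep the address
    }                            // step 3: iterator and its Item are destroyed
    // Step 4: 'borrowed' now dangles. It is deliberately NOT dereferenced here,
    // because doing so would be undefined behaviour.
    (void)borrowed;
    log.push_back("use (dangling)");
    return log;
}
```

Calling replay_events() yields the same ordering as the trimmed output: the element is destroyed together with the iterator, before the reference obtained from it would be used.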

Answer 1:

If A is a container that owns instances of Type, then consider having A::iterator contain a handle to Type instead of having a Type:

class iterator {
  [...]
private:
  Type* instance; // has a handle to a Type instance.
};

Instead of:

class iterator {
  [...]
private:
  Type instance; // has a Type instance.
};

In Python, an iterator holds a reference to the container it iterates over. This extends the lifetime of the iterable object and prevents it from being garbage collected during iteration.

>>> from sys import getrefcount
>>> x = [1,2,3]
>>> getrefcount(x)
2 # One for 'x' and one for the argument within the getrefcount function.
>>> iter = x.__iter__()
>>> getrefcount(x)
3 # One more, as iter contains a reference to 'x'.

boost::python supports this behavior. Here is an example program, with Foo being a simple type that cannot be copied; FooContainer being an iterable container; and FooContainer::iterator being an iterator:

#include <boost/python.hpp>
#include <iostream> // std::cout, std::endl
#include <iterator>

// Simple example type.
class Foo
{
public:
  Foo()  { std::cout << "Foo constructed: " << this << std::endl; }
  ~Foo() { std::cout << "Foo destroyed:   " << this << std::endl; }
  void set_x( int x ) { x_ = x;    }
  int  get_x()        { return x_; }
private:
  Foo( const Foo& );            // Prevent copy.
  Foo& operator=( const Foo& ); // Prevent assignment.
private:
  int x_;  
};

// Container for Foo objects.
class FooContainer
{
private:
  enum { ARRAY_SIZE = 3 };
public:
  // Default constructor.
  FooContainer()
  {
    std::cout << "FooContainer constructed: " << this << std::endl;
    for ( int i = 0; i < ARRAY_SIZE; ++i )
    {
      foos_[ i ].set_x( ( i + 1 ) * 10 );
    }
  }

  ~FooContainer()
  {
    std::cout << "FooContainer destroyed:   " << this << std::endl;
  }

  // Iterator for Foo types.  
  class iterator
    : public std::iterator< std::forward_iterator_tag, Foo >
  {
    public:
      // Constructors.
      iterator()                      : foo_( 0 )        {} // Default (empty).
      iterator( const iterator& rhs ) : foo_( rhs.foo_ ) {} // Copy.
      explicit iterator(Foo* foo)     : foo_( foo )      {} // With position.

      // Dereference.
      Foo& operator*() { return *foo_; }

      // Pre-increment
      iterator& operator++() { ++foo_; return *this; }
      // Post-increment.     
      iterator  operator++( int )
      {
        iterator tmp( foo_ );
        operator++();
        return tmp;
      }

      // Comparison.
      bool operator==( const iterator& rhs ) const { return foo_ == rhs.foo_; }
      bool operator!=( const iterator& rhs ) const
      {
        return !this->operator==( rhs );
      }

    private:
      Foo* foo_; // Contain a handle to foo; FooContainer owns Foo.
  };

  // begin() and end() are requirements for the boost::python's 
  // iterator< container > spec.
  iterator begin() { return iterator( foos_ );              }
  iterator end()   { return iterator( foos_ + ARRAY_SIZE ); }
private:
  FooContainer( const FooContainer& );            // Prevent copy.
  FooContainer& operator=( const FooContainer& ); // Prevent assignment.
private:
  Foo foos_[ ARRAY_SIZE ];
};

BOOST_PYTHON_MODULE(iterator_example)
{
  using namespace boost::python;
  class_< Foo, boost::noncopyable >( "Foo" )
    .def( "get_x", &Foo::get_x )
    ;
  class_< FooContainer, boost::noncopyable >( "FooContainer" )
    .def("__iter__", iterator< FooContainer, return_internal_reference<> >())
    ;
}

Here is the example output:

>>> from iterator_example import FooContainer
>>> fc = FooContainer()
Foo constructed: 0x8a78f88
Foo constructed: 0x8a78f8c
Foo constructed: 0x8a78f90
FooContainer constructed: 0x8a78f88
>>> for foo in fc:
...   print foo.get_x()
... 
10
20
30
>>> fc = foo = None
FooContainer destroyed:   0x8a78f88
Foo destroyed:   0x8a78f90
Foo destroyed:   0x8a78f8c
Foo destroyed:   0x8a78f88
>>> 
>>> fc = FooContainer()
Foo constructed: 0x8a7ab48
Foo constructed: 0x8a7ab4c
Foo constructed: 0x8a7ab50
FooContainer constructed: 0x8a7ab48
>>> iter = fc.__iter__()
>>> fc = None
>>> iter.next().get_x()
10
>>> iter.next().get_x()
20
>>> iter = None
FooContainer destroyed:   0x8a7ab48
Foo destroyed:   0x8a7ab50
Foo destroyed:   0x8a7ab4c
Foo destroyed:   0x8a7ab48


Answer 2:

I think the whole problem was that I did not fully understand what semantics an iterator class should provide. It seems that the value returned by an iterator has to remain valid for as long as the container exists, not the iterator.

This means that boost::python behaves correctly, and there are two solutions:

  • use boost::shared_ptr
  • return by value

Both approaches are a bit less efficient than what I tried to do, but it looks like there is no other way.
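
The shared-pointer option can be sketched in plain C++ as follows. This is a minimal sketch under stated assumptions, not the asker's actual code: Base and View are hypothetical stand-ins for BaseType and Type, SharedIterator and first_element are made-up names, and std::shared_ptr is used in place of boost::shared_ptr so that the snippet has no Boost dependency. The iterator builds each element on demand and hands out shared ownership, so the element no longer depends on the iterator's lifetime.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for BaseType (what the container really stores).
struct Base { int key; };

// Hypothetical stand-in for Type: non-copyable, no default constructor.
class View {
public:
    explicit View(const Base& b) : key_(b.key) {}
    View(const View&) = delete;
    View& operator=(const View&) = delete;
    int key() const { return key_; }
private:
    int key_;
};

// Transform-style iterator: dereferencing creates a View and shares ownership
// of it with the caller instead of returning a reference to a member.
class SharedIterator {
public:
    SharedIterator(const std::vector<Base>* items, std::size_t pos)
        : items_(items), pos_(pos) {}
    std::shared_ptr<View> operator*() const {
        return std::make_shared<View>((*items_)[pos_]);
    }
    SharedIterator& operator++() { ++pos_; return *this; }
    bool operator==(const SharedIterator& rhs) const { return pos_ == rhs.pos_; }
    bool operator!=(const SharedIterator& rhs) const { return pos_ != rhs.pos_; }
private:
    const std::vector<Base>* items_;
    std::size_t pos_;
};

// The returned element stays valid after the iterator has been destroyed.
std::shared_ptr<View> first_element(const std::vector<Base>& items) {
    std::shared_ptr<View> held;
    {
        SharedIterator it(&items, 0);
        held = *it;
    }  // 'it' is gone here; 'held' still owns the View
    return held;
}
```

On the Boost.Python side, an iterator like this would presumably be exposed with a by-value return policy rather than return_internal_reference<>, since there is no longer an internal reference to protect. The cost is one allocation per dereference, which matches the "a bit less efficient" remark above.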

edit: I have worked out a solution (it is not only possible, it also seems to work nicely): Boost python container, iterator and item lifetimes



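Answer 3:

Here is a relevant sample: https://wiki.python.org/moin/boost.python/iterator.
You can return the iterator's value by const or non-const reference. Note that the copy_non_const_reference policy used below copies the referenced value into a new Python object, so it requires a copyable element type: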

...
.def("__iter__"
     , range<return_value_policy<copy_non_const_reference> >(
           &my_sequence<heavy>::begin
         , &my_sequence<heavy>::end))

The idea, as you mentioned, is that the lifetime of the returned value should be tied to the container rather than to the iterator.
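
For a container that does not store the exposed type directly (the transform-style case from the question's edit), one possible way to tie element lifetime to the container is to let the container lazily build and cache the elements, with the iterator holding only a position. The sketch below is an assumption-laden illustration, not the approach from any of the answers: Source, Element, CachingContainer and key_after_iterator_gone are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for BaseType.
struct Source { int key; };

// Hypothetical stand-in for Type: non-copyable, no default constructor.
class Element {
public:
    explicit Element(const Source& s) : key_(s.key) {}
    Element(const Element&) = delete;
    Element& operator=(const Element&) = delete;
    int key() const { return key_; }
private:
    int key_;
};

// The container owns every Element it hands out, creating each one on first use.
class CachingContainer {
public:
    explicit CachingContainer(std::vector<Source> sources)
        : sources_(std::move(sources)), cache_(sources_.size()) {}

    class iterator {
    public:
        iterator(CachingContainer* owner, std::size_t pos) : owner_(owner), pos_(pos) {}
        // The reference points into the container, not into the iterator.
        Element& operator*() const { return owner_->element_at(pos_); }
        iterator& operator++() { ++pos_; return *this; }
        bool operator==(const iterator& rhs) const { return pos_ == rhs.pos_; }
        bool operator!=(const iterator& rhs) const { return pos_ != rhs.pos_; }
    private:
        CachingContainer* owner_;
        std::size_t pos_;
    };

    iterator begin() { return iterator(this, 0); }
    iterator end()   { return iterator(this, sources_.size()); }

private:
    Element& element_at(std::size_t pos) {
        if (!cache_[pos]) cache_[pos].reset(new Element(sources_[pos]));
        return *cache_[pos];
    }
    std::vector<Source> sources_;                  // initialized before cache_
    std::vector<std::unique_ptr<Element>> cache_;  // one slot per source
};

// The reference obtained from an iterator remains usable after the iterator
// is destroyed, because the container still owns the Element.
int key_after_iterator_gone(CachingContainer& c) {
    Element* e = nullptr;
    {
        CachingContainer::iterator it = c.begin();
        e = &*it;
    }
    return e->key();
}
```

With elements owned by the container, a policy such as return_internal_reference<> would be keeping the right object alive, at the price of caching one Element per stored item.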