how to return numpy.array from boost::python?

2019-01-06 14:26发布

问题:

I would like to return some data from c++ code as a numpy.array object. I had a look at boost::python::numeric, but its documentation is very terse. Can I get an example of e.g. returning a (not very large) vector<double> to python? I don't mind doing copies of data.

回答1:

UPDATE: the library described in my original answer (https://github.com/ndarray/Boost.NumPy) has been integrated directly into Boost.Python as of Boost 1.63, and hence the standalone version is now deprecated. The text below now corresponds to the new, integrated version (only the namespace has changed).

Boost.Python now includes a moderately complete wrapper of the NumPy C-API into a Boost.Python interface. It's pretty low-level, and mostly focused on how to address the more difficult problem of how to pass C++ data to and from NumPy without copying, but here's how you'd do a copied std::vector return with that:

#include "boost/python/numpy.hpp"

namespace bp = boost::python;
namespace bn = boost::python::numpy;

std::vector<double> myfunc(...);

bn::ndarray mywrapper(...) {
    std::vector<double> v = myfunc(...);
    Py_intptr_t shape[1] = { v.size() };
    bn::ndarray result = bn::zeros(1, shape, bn::dtype::get_builtin<double>());
    std::copy(v.begin(), v.end(), reinterpret_cast<double*>(result.get_data()));
    return result;
}

BOOST_PYTHON_MODULE(example) {
    bn::initialize();
    bp::def("myfunc", mywrapper);
}


回答2:

A solution that doesn't require you to download any special 3rd party C++ library (but you need numpy).

#include <numpy/ndarrayobject.h> // ensure you include this header

boost::python::object stdVecToNumpyArray( std::vector<double> const& vec )
{
      npy_intp size = vec.size();

     /* const_cast is rather horrible but we need a writable pointer
        in C++11, vec.data() will do the trick
        but you will still need to const_cast
      */

      double * data = size ? const_cast<double *>(&vec[0]) 
        : static_cast<double *>(NULL); 

    // create a PyObject * from pointer and data 
      PyObject * pyObj = PyArray_SimpleNewFromData( 1, &size, NPY_DOUBLE, data );
      boost::python::handle<> handle( pyObj );
      boost::python::numeric::array arr( handle );

    /* The problem of returning arr is twofold: firstly the user can modify
      the data which will betray the const-correctness 
      Secondly the lifetime of the data is managed by the C++ API and not the 
      lifetime of the numpy array whatsoever. But we have a simple solution..
     */

       return arr.copy(); // copy the object. numpy owns the copy now.
  }

Of course you might write a function from double * and size, which is generic then invoke that from the vector by extracting this info. You could also write a template but you'd need some kind of mapping from data type to the NPY_TYPES enum.



回答3:

It's a bit late, but after many unsuccessful tries I found a way to expose c++ arrays as numpy arrays directly. Here is a short C++11 example using boost::python and Eigen:

#include <numpy/ndarrayobject.h>
#include <boost/python.hpp>

#include <Eigen/Core>

// c++ type
struct my_type {
  Eigen::Vector3d position;
};


// wrap c++ array as numpy array
static boost::python::object wrap(double* data, npy_intp size) {
  using namespace boost::python;

  npy_intp shape[1] = { size }; // array size
  PyObject* obj = PyArray_New(&PyArray_Type, 1, shape, NPY_DOUBLE, // data type
                              NULL, data, // data pointer
                              0, NPY_ARRAY_CARRAY, // NPY_ARRAY_CARRAY_RO for readonly
                              NULL);
  handle<> array( obj );
  return object(array);
}



// module definition
BOOST_PYTHON_MODULE(test)
{
  // numpy requires this
  import_array();

  using namespace boost::python;

  // wrapper for my_type
  class_< my_type >("my_type")
    .add_property("position", +[](my_type& self) -> object {
        return wrap(self.position.data(), self.position.size());
      });

}

The example describes a "getter" for the property. For the "setter", the easiest way is to assign the array elements manually from a boost::python::object using a boost::python::stl_input_iterator<double>.



回答4:

Doing it using the numpy api directly is not necessarily difficult, but I use boost::multiarray regularly for my projects and find it convenient to transfer the shapes of the array between the C++/Python boundary automatically. So, here is my recipe. Use http://code.google.com/p/numpy-boost/, or better yet, this version of the numpy_boost.hpp header; which is a better fit for multi-file boost::python projects, although it uses some C++11. Then, from your boost::python code, use something like this:

PyObject* myfunc(/*....*/)
{
   // If your data is already in a boost::multiarray object:
   // numpy_boost< double, 1 > to_python( numpy_from_boost_array(result_cm) );
   // otherwise:
   numpy_boost< double, 1> to_python( boost::extents[n] );
   std::copy( my_vector.begin(), my_vector.end(), to_python.begin() );

   PyObject* result = to_python.py_ptr();
   Py_INCREF( result );

   return result;
}


回答5:

I looked at the available answers and thought, "this will be easy". I proceeded to spend hours attempting what seemed like a trivial examples/adaptations of the answers.

Then I implemented @max's answer exactly (had to install Eigen) and it worked fine, but I still had trouble adapting it. My problems were mostly (by number) silly, syntax mistakes, but additionally I was using a pointer to a copied std::vector's data after the vector seemed to be dropped off the stack.

In this example, a pointer to the std::vector is returned, but also you could return the size and data() pointer or use any other implementation that gives your numpy array access to the underlying data in a stable manner (i.e. guaranteed to exist):

class_<test_wrap>("test_wrap")
    .add_property("values", +[](test_wrap& self) -> object {
            return wrap(self.pvalues()->data(),self.pvalues()->size());
        })
    ;

For test_wrap with a std::vector<double> (normally pvalues() might just return the pointer without populating the vector):

class test_wrap {
public:
    std::vector<double> mValues;
    std::vector<double>* pvalues() {
        mValues.clear();
        for(double d_ = 0.0; d_ < 4; d_+=0.3)
        {
            mValues.push_back(d_);
        }
        return &mValues;
    }
};

The full example is on Github so you can skip the tedious transcription steps and worry less about build, libs, etc. You should be able to just do the following and get a functioning example (if you have the necessary features installed and your path setup already):

git clone https://github.com/ransage/boost_numpy_example.git
cd boost_numpy_example
# Install virtualenv, numpy if necessary; update path (see below*)
cd build && cmake .. && make && ./test_np.py

This should give the output:

# cmake/make output
values has type <type 'numpy.ndarray'>
values has len 14
values is [ 0.   0.3  0.6  0.9  1.2  1.5  1.8  2.1  2.4  2.7  3.   3.3  3.6  3.9]

*In my case, I put numpy into a virtualenv as follows - this should be unnecessary if you can execute python -c "import numpy; print numpy.get_include()" as suggested by @max:

# virtualenv, pip, path unnecessary if your Python has numpy
virtualenv venv
./venv/bin/pip install -r requirements.txt 
export PATH="$(pwd)/venv/bin:$PATH"

Have fun! :-)