Strange Segmentation Fault in PyArray_SimpleNewFro

Posted 2019-07-26 02:12

Question:

My question is similar "in spirit" to "Segmentation fault in PyArray_SimpleNewFromData".

I have C code that looks like this (the original code also tests whether malloc() returned NULL):

  #include <Python.h>
  #include <numpy/arrayobject.h>  // (Not sure if right import)
  #include <stdlib.h>
  #include <stdio.h>

  double *calculate_dW(npy_intp *dim_w) {
          int i;
          double* data = (double*)malloc(sizeof(double) * dim_w[0]);

          /* Inserts some dummy data */
          for (i = 0; i < dim_w[0]; i++)
                  data[i] = i;

          return data;
  }

And then Cython code that wraps it in a function:

  import cython
  import numpy as np
  cimport numpy as np

  cdef extern double *calculate_dW(np.npy_intp *dim_w)

  def run_calculate_dW(np.ndarray[np.npy_intp, ndim=1, mode="c"] dim_w):
          print("Will call calculate_dW")
          cdef double *dW = calculate_dW(&dim_w[0])

          print("Will call PyArray_SimpleNewFromData")
          ret = np.PyArray_SimpleNewFromData(
                  1,
                  &dim_w[0],
                  np.NPY_FLOAT64,
                  dW)
          print("Will print")
          print(ret)
          print("Will return")
          return ret

Which I test with

  # runTest.py
  import numpy as np
  import multiply
  a = np.array((10,)) # as expected, using `np.array(10)` won't work
  print(a)
  multiply.run_calculate_dW(a)

And get the following output

$ PYTHONPATH=build/lib.linux-x86_64-2.7/ python runTest.py 
[10]
Will call calculate_dW
Will call PyArray_SimpleNewFromData
Segmentation fault (core dumped)

That is, the segfault occurs in the call to PyArray_SimpleNewFromData(): if I replace it with, say, ret = 1, the segmentation fault vanishes. When debugging, I tried many things:

  • Changing the number of dimensions to 1;
  • Increasing the amount of memory allocated by malloc() (to guarantee I was not accessing anything I shouldn't);
  • Changing np.NPY_FLOAT32 to np.float32;
  • Changing the way I pass the "shape" of the new array.

I believe I am following the documentation precisely, as well as the answer to this other question. I don't get any compiler errors or warnings.

Still, I have noticed that all the other code I found on the internet calls PyArray_SimpleNewFromData from C (instead of Python). I tried returning a PyObject* from the C function, but couldn't get it to compile.

Also, I do get a "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" warning, but I have read that it is safe to ignore. (Cython Numpy warning about NPY_NO_DEPRECATED_API when using MemoryView)

Any suggestions? (Also, is there any other way of creating a numpy array out of dW?)

Answer 1:

I think the issue is that you're passing a Python list as the second argument to PyArray_SimpleNewFromData when it expects a pointer to an integer. I'm a little surprised this compiles.

Try:

ret = np.PyArray_SimpleNewFromData(
                     1,         # number of dimensions, i.e. how many entries of dim_w are read
                     &dim_w[0], # pointer to first element
                     np.NPY_FLOAT64,
                     dW)

Note that I've also changed the type to NPY_FLOAT64 since that should match double.

I'd also change the definition of dim_w to

np.ndarray[np.npy_intp, ndim=1, mode="c"] dim_w

to ensure that the type of the array matches what numpy is expecting. This may also require changing the signature of calculate_dW to double *calculate_dW(intptr_t *dim_w) to match too.
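
The caller can make the same guarantee on the Python side. A minimal sketch, assuming only NumPy itself (the helper name `make_shape_array` is hypothetical and not part of the original code): it builds the 1-D, C-contiguous, pointer-sized integer array that the typed dim_w argument expects, and shows why a 0-d array such as `np.array(10)` does not qualify.

```python
import numpy as np

def make_shape_array(*dims):
    """Hypothetical helper: pack dimensions into a 1-D, C-contiguous np.intp array."""
    return np.ascontiguousarray(dims, dtype=np.intp)

# One dimension of length 10, stored as a pointer-sized integer.
dim_w = make_shape_array(10)
print(dim_w.dtype == np.intp, dim_w.ndim, dim_w.shape)

# np.array(10) is a 0-d array: it has no first element to take the address of.
scalar_like = np.array(10)
print(scalar_like.ndim, scalar_like.shape)
```

Passing an explicit `dtype=np.intp` avoids relying on the platform's default integer type happening to match `npy_intp`.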


Edit: A second issue is that you need to include the line

np.import_array()

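in your Cython file (just at the top level, after your imports). This initializes NumPy's C API: without it, the table of function pointers behind calls such as PyArray_SimpleNewFromData is never filled in, so the call dereferences a NULL pointer and segfaults. I think the documentation recommends always including it when you cimport numpy. In practice it only sometimes matters, and this is one of those times.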
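
As for the question's closing request (another way to build an array from dW): one option that avoids calling the NumPy C API directly is to view the buffer through ctypes. The sketch below rests on assumptions and is not the code from the question: `fill_buffer` is a hypothetical pure-Python stand-in for calculate_dW, and the buffer is allocated by ctypes rather than by malloc().

```python
import ctypes
import numpy as np

def fill_buffer(n):
    """Hypothetical stand-in for calculate_dW: a C double[n] holding 0..n-1."""
    buf = (ctypes.c_double * n)()
    for i in range(n):
        buf[i] = float(i)
    return buf

n = 10
buf = fill_buffer(n)

# A C function would normally hand back a bare double*; emulate that here.
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_double))

# With a bare pointer, the shape has to be supplied explicitly.
arr = np.ctypeslib.as_array(ptr, shape=(n,))
print(arr)

# No copy is made: writing through the array changes the C buffer.
arr[0] = 42.0
print(buf[0])
```

Because the array is only a view, the buffer must outlive it. With memory that really comes from malloc(), something still has to call free() once the array is no longer in use.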


(Answer is now tested)