Cython: unsigned int indices for numpy arrays give

2019-06-17 07:29发布

问题:

I converted to cython a python function by just adding some types and compiling it. I was getting small numerical differences between the results of the python and cython functions. After some work I found that the differences came from accessing a numpy array using unsigned int instead of int.

I was using unsigned int indices to speed up access according to: http://docs.cython.org/src/userguide/numpy_tutorial.html#tuning-indexing-further

anyway I thought it was harmless to use unsigned ints.

See this code:

cpdef function(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef unsigned int x, y   
    x, y = int(max_loc[0]), int(max_loc[1])
    x2, y2 = int(max_loc[0]), int(max_loc[1])
    print response[y,x], type(response[y,x]), response.dtype
    print response[y2,x2], type(response[y2,x2]), response.dtype   
    print 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    print 2*(response[y2,x2] - min(response[y2,x2-1], response[y2,x2+1]))  

prints:

0.959878861904 <type 'float'> float32
0.959879 <type 'numpy.float32'> float32
1.04306024313
1.04306030273   

Why does this happen?!!! is it a bug?

Ok, as requested here is a SSCCE with the same types and values that I used in my original function

cpdef function():
    cdef unsigned int x, y  
    max_loc2 = np.asarray([ 15., 25.], dtype=float) 
    cdef np.ndarray[np.float32_t, ndim=2] response2 = np.zeros((49,49), dtype=np.float32)    
    x, y = int(max_loc2[0]), int(max_loc2[1])
    x2, y2 = int(max_loc2[0]), int(max_loc2[1])

    response2[y,x] = 0.959878861904  
    response2[y,x-1] = 0.438348740339
    response2[y,x+1] = 0.753262758255  


    print response2[y,x], type(response2[y,x]), response2.dtype
    print response2[y2,x2], type(response2[y2,x2]), response2.dtype
    print 2*(response2[y,x] - min(response2[y,x-1], response2[y,x+1]))
    print 2*(response2[y2,x2] - min(response2[y2,x2-1], response2[y2,x2+1]))  

prints

0.959878861904 <type 'float'> float32
0.959879 <type 'numpy.float32'> float32
1.04306024313
1.04306030273

I use python 2.7.3 cython 0.18 and msvc9 express

回答1:

I modified the example in the question to make it simpler to read the generated C source for the module. I'm only interested in seeing the logic that creates Python float objects instead of getting np.float32 objects from the response array.

I'm using pyximport to compile the extension module. It saves the generated C file in a subdirectory of ~/.pyxbld (probably %userprofile%\.pyxbld on Windows).

import numpy as np
import pyximport
pyximport.install(setup_args={'include_dirs': [np.get_include()]})

open('_tmp.pyx', 'w').write('''
cimport numpy as np
cpdef function(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef unsigned int p_one, q_one
    p_one = int(max_loc[0])
    q_one = int(max_loc[1])
    p_two = int(max_loc[0])
    q_two = int(max_loc[1])
    r_one = response[q_one, p_one]
    r_two = response[q_two, p_two]
''')

import _tmp
assert(hasattr(_tmp, 'function'))

Here's the generated C code for the section of interest (a bit reformatted to make it easier to read). It turns out that when you use C unsigned int index variables, the generated code grabs the data directly from the array buffer and calls PyFloat_FromDouble, which coerces it to double. On the other hand, when you use Python int index variables, it takes the generic approach. It forms a tuple and calls PyObject_GetItem. This way allows the ndarray to correctly honor the np.float32 dtype.

#define __Pyx_BufPtrStrided2d(type, buf, i0, s0, i1, s1) \
    (type)((char*)buf + i0 * s0 + i1 * s1)

  /* "_tmp.pyx":9
 *     p_two = int(max_loc[0])
 *     q_two = int(max_loc[1])
 *     r_one = response[q_one, p_one]             # <<<<<<<<<<<<<<
 *     r_two = response[q_two, p_two]
 */
  __pyx_t_3 = __pyx_v_q_one;
  __pyx_t_4 = __pyx_v_p_one;
  __pyx_t_5 = -1;

  if (unlikely(__pyx_t_3 >= (size_t)__pyx_bshape_0_response))
    __pyx_t_5 = 0;
  if (unlikely(__pyx_t_4 >= (size_t)__pyx_bshape_1_response))
    __pyx_t_5 = 1;

  if (unlikely(__pyx_t_5 != -1)) {
    __Pyx_RaiseBufferIndexError(__pyx_t_5);
    {
      __pyx_filename = __pyx_f[0];
      __pyx_lineno = 9;
      __pyx_clineno = __LINE__;
      goto __pyx_L1_error;
    }
  }

  __pyx_t_1 = PyFloat_FromDouble((
    *__Pyx_BufPtrStrided2d(
      __pyx_t_5numpy_float32_t *,
      __pyx_bstruct_response.buf,
      __pyx_t_3, __pyx_bstride_0_response,
      __pyx_t_4, __pyx_bstride_1_response)));

  if (unlikely(!__pyx_t_1)) {
    __pyx_filename = __pyx_f[0];
    __pyx_lineno = 9;
    __pyx_clineno = __LINE__;
    goto __pyx_L1_error;
  }

  __Pyx_GOTREF(__pyx_t_1);
  __pyx_v_r_one = __pyx_t_1;
  __pyx_t_1 = 0;

  /* "_tmp.pyx":10
 *     q_two = int(max_loc[1])
 *     r_one = response[q_one, p_one]
 *     r_two = response[q_two, p_two]             # <<<<<<<<<<<<<<
 */
  __pyx_t_1 = PyTuple_New(2);

  if (unlikely(!__pyx_t_1)) {
    __pyx_filename = __pyx_f[0];
    __pyx_lineno = 10;
    __pyx_clineno = __LINE__;
    goto __pyx_L1_error;
  }

  __Pyx_GOTREF(((PyObject *)__pyx_t_1));
  __Pyx_INCREF(__pyx_v_q_two);
  PyTuple_SET_ITEM(__pyx_t_1, 0, __pyx_v_q_two);
  __Pyx_GIVEREF(__pyx_v_q_two);
  __Pyx_INCREF(__pyx_v_p_two);
  PyTuple_SET_ITEM(__pyx_t_1, 1, __pyx_v_p_two);
  __Pyx_GIVEREF(__pyx_v_p_two);

  __pyx_t_2 = PyObject_GetItem(
    ((PyObject *)__pyx_v_response),
    ((PyObject *)__pyx_t_1));

  if (!__pyx_t_2) {
    __pyx_filename = __pyx_f[0];
    __pyx_lineno = 10;
    __pyx_clineno = __LINE__;
    goto __pyx_L1_error;
  }

  __Pyx_GOTREF(__pyx_t_2);
  __Pyx_DECREF(((PyObject *)__pyx_t_1));
  __pyx_t_1 = 0;
  __pyx_v_r_two = __pyx_t_2;
  __pyx_t_2 = 0;


回答2:

Playing around with this on my machine, I don't see a difference. I'm using the ipython notebook with cython magic:

In [1]:

%load_ext cythonmagic

In [12]:

%%cython

import numpy as np
cimport numpy as np

cpdef function(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef unsigned int x, y   
    x, y = int(max_loc[0]), int(max_loc[1])
    x2, y2 = int(max_loc[0]), int(max_loc[1])
    #return 2*(response[y,x] - min(response[y,x-1], response[y,x+1])), 2*(response[y2,x2] - min(response[y2,x2-1], response[y2,x2+1]))
    print response[y,x], type(response[y,x]), response.dtype
    print response[y2,x2], type(response[y2,x2]), response.dtype   
    print 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    print 2*(response[y2,x2] - min(response[y2,x2-1], response[y2,x2+1])) 

In [13]:

a = np.random.normal(size=(10,10)).astype(np.float32)
m = [3,2]
function(a,m)

0.586090564728 <type 'float'> float32
0.586091 <type 'numpy.float32'> float32
4.39655685425
4.39655685425

The first pair of results, the difference is just the output precision of the print statement. What version of Cython are you using? The indices are extremely unlikely to effect the answer since it is just accessing a fixed length of memory that the data attribute of the numpy array is storing.