Passing structured array to Cython, failed (I thin

2020-02-13 00:30发布

问题:

Suppose I have

a = np.zeros(2, dtype=[('a', np.int),  ('b', np.float, 2)])
a[0] = (2,[3,4])
a[1] = (6,[7,8])

then I define the same Cython structure

import numpy as np
cimport numpy as np

cdef packed struct mystruct:
  np.int_t a
  np.float_t b[2]

def test_mystruct(mystruct[:] x):
  cdef:
    int k
    mystruct y

  for k in range(2):
    y = x[k]
    print y.a
    print y.b[0]
    print y.b[1]

after this, I run

test_mystruct(a)

and I got error:

ValueError                                Traceback (most recent call last)
<ipython-input-231-df126299aef1> in <module>()
----> 1 test_mystruct(a)
_cython_magic_5119cecbaf7ff37e311b745d2b39dc32.pyx in _cython_magic_5119cecbaf7ff37e311b745d2b39dc32.test_mystruct (/auto/users/pwang/.cache/ipython/cython/_cython_magic_5119cecbaf7ff37e311b745d2b39dc32.c:1364)()
ValueError: Expected 1 dimension(s), got 1

My question is how to fix it? Thank you.

回答1:

This pyx compiles and imports ok:

import numpy as np
cimport numpy as np

cdef packed struct mystruct:
  int a[2]    # change from plain int
  float b[2]
  int c

def test_mystruct(mystruct[:] x):
  cdef:
    int k
    mystruct y

  for k in range(2):
    y = x[k]
    print y.a
    print y.b[0]
    print y.b[1]

dt='2i,2f,i'
b=np.zeros((3,),dtype=dt)
test_mystruct(b)

I started with the test example mentioned in my comment, and played with your case. I think the key change was to define the first element of the packed structure to be int a[2]. So if any element is an array, the first must an array to properly set up the structure.

Clearly an error that the test file isn't catching.

Defining the element as int a[1] doesn't work, possibly because the dtype removes such a dimension:

In [47]: np.dtype([('a', np.int, 1),  ('b', np.float, 2)])
Out[47]: dtype([('a', '<i4'), ('b', '<f8', (2,))])

Defining the dtype to get around this shouldn't be hard until the issue is raised and patched.


The struct could have a[1], but the array dtype would have to specify the size with a tuple: ('a','i',(1,)). ('a','i',1) would make the size ().

If one of the struct arrays is 2d, it looks like all of them have to be:

cdef packed struct mystruct:
  int a[1][1]
  float b[2][1]
  int c[2][2]

https://github.com/cython/cython/blob/c4c2e3d8bd760386b26dbd6cffbd4e30ba0a7d13/tests/memoryview/numpy_memoryview.pyx


Stepping back a bit, I wonder what's the point to processing a complex structured array in cython. For some operations wouldn't it work just as well to pass the fields as separate variables. For example myfunc(a['a'],a['b']) instead of myfunc(a).



回答2:

There is a general method to get the dtype for a c struct, but it involves a temporary variable:

cdef mystruct _tmp
dt = np.asarray(<mystruct[:1]>(&_tmp)).dtype

This requires at least numpy 1.5. See discussion here: https://github.com/scikit-learn/scikit-learn/pull/2298



标签: numpy cython