Numpy structured arrays: string type not understood

Posted 2019-05-10 18:25

Question:

Here's what happens if I initialize a struct array with the same field names and types in different ways:

>>> a = np.zeros(2, dtype=[('x','int64'),('y','a')])
>>> a
array([(0L, ''), (0L, '')],
 dtype=[('x', '<i8'), ('y', 'S')])

So initializing with a list of tuples works fine.

>>> mdtype = dict(names=['x','y'],formats=['int64','a'])
>>> mdtype
{'names': ['x', 'y'], 'formats': ['int64', 'a']}
>>> a = np.zeros(2,dtype=mdtype)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
TypeError: data type not understood

So initializing with a dict doesn't work, and the problem is the string type:

>>> mdtype = dict(names=['x','y'],formats=['int64','float64'])
>>> a = np.zeros(2,dtype=mdtype)
>>>

No problems there. Any ideas? Is this a Numpy bug?

Numpy version: 1.8.0

Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32

Answer 1:

As a workaround, it works if you specify the string width:

>>> mdtype = dict(names=['x','y'],formats=['int64','a1'])
>>> np.dtype(mdtype)
dtype([('x', '<i8'), ('y', 'S1')])

Probably related to this and this. If it isn't a bug, it is awfully close...
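
For completeness, here is a minimal sketch of the explicit-width workaround used end to end. The width of 10 bytes and the sample values are arbitrary choices for illustration and are not from the question. It uses 'S10' rather than 'a10', since 'S' is the spelling NumPy itself prints in the dtype repr above.

```python
import numpy as np

# Dict-style dtype spec with an explicit string width.
# 'S10' means a byte string of up to 10 bytes; 10 is an arbitrary example value.
mdtype = {'names': ['x', 'y'], 'formats': ['int64', 'S10']}
dt = np.dtype(mdtype)

# With a sized string field, np.zeros accepts the dict-based dtype.
a = np.zeros(2, dtype=dt)
a['x'] = [1, 2]
a['y'] = [b'foo', b'bar']

# The list-of-tuples spec with the same width describes the same layout.
dt_list = np.dtype([('x', 'int64'), ('y', 'S10')])
same = (dt == dt_list)
```

If the maximum string length isn't known up front, one option is to compute it from the data first and build the format string from that, e.g. 'S%d' % max_len.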