Truly recursive `tolist()` for NumPy structured arrays

Posted 2019-07-13 20:41

From what I understand, the recommended way to convert a NumPy array into a native Python list is to use ndarray.tolist.

Alas, this doesn't seem to work recursively when using structured arrays. Indeed, some ndarray objects are being referenced in the resulting list, unconverted:

>>> dtype = numpy.dtype([('position', numpy.int32, 3)])
>>> values = [([1, 2, 3],)]
>>> a = numpy.array(values, dtype=dtype)
>>> a.tolist()
[(array([1, 2, 3], dtype=int32),)]

I did write a simple function to work around this issue:

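import numpy

def array_to_list(array):
    """Recursively convert ndarrays (and any nested inside lists/tuples) to pure Python objects."""
    if isinstance(array, numpy.ndarray):
        # tolist() may leave nested ndarrays behind, so recurse on its result
        return array_to_list(array.tolist())
    elif isinstance(array, list):
        return [array_to_list(item) for item in array]
    elif isinstance(array, tuple):
        return tuple(array_to_list(item) for item in array)
    else:
        # scalars, strings, etc. are already native Python objects
        return array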

Which, when used, provides the expected result:

>>> array_to_list(a) == values
True

The problem with this function is that it duplicates the work of ndarray.tolist: it rebuilds every list and tuple that tolist has already produced. Not optimal.

So the questions are:

  • is this behaviour of ndarray.tolist to be expected?
  • is there a better way to make this happen?

1 answer
趁早两清
Reply #2 · 2019-07-13 21:28

Just to generalize this a bit, I'll add another field to your dtype:

In [234]: dt = np.dtype([('position', np.int32, 3),('id','U3')])

In [235]: a=np.ones((3,),dtype=dt)

The repr display does use lists and tuples:

In [236]: a
Out[236]: 
array([([1, 1, 1], '1'), ([1, 1, 1], '1'), ([1, 1, 1], '1')], 
  dtype=[('position', '<i4', (3,)), ('id', '<U3')])

but as you note, tolist does not expand the elements.

In [237]: a.tolist()
Out[237]: [(array([1, 1, 1]), '1'), (array([1, 1, 1]), '1'), 
   (array([1, 1, 1]), '1')]

Similarly, such an array can be created from the fully nested lists and tuples.

In [238]: a=np.array([([1,2,3],'str')],dtype=dt)
In [239]: a
Out[239]: 
array([([1, 2, 3], 'str')], 
  dtype=[('position', '<i4', (3,)), ('id', '<U3')])
In [240]: a.tolist()
Out[240]: [(array([1, 2, 3]), 'str')]
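
To make the "not expanded" part concrete, here is a quick check written as a plain script rather than a session. It is only a sketch using the same dtype as above; the name `record` is mine. It shows that each record comes back as a tuple, that the sub-array field is still an ndarray, and that calling tolist on the field alone does give nested lists.

```python
import numpy as np

# Same dtype as in the session above
dt = np.dtype([('position', np.int32, 3), ('id', 'U3')])
a = np.array([([1, 2, 3], 'str')], dtype=dt)

record = a.tolist()[0]

# The record itself is a plain tuple ...
print(type(record))
# ... but the sub-array field inside it is still an ndarray
print(type(record[0]))
# ... while the string field is a native Python str
print(type(record[1]))

# Accessing the field first gives a regular (n, 3) array,
# and tolist() on that converts all the way down.
print(a['position'].tolist())   # [[1, 2, 3]]
```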

There's no problem recreating the array from this incomplete recursion:

In [250]: np.array(a.tolist(),dtype=dt)
Out[250]: 
array([([1, 2, 3], 'str')], 
      dtype=[('position', '<i4', (3,)), ('id', '<U3')])

This is the first time I've seen anyone use tolist with a structured array like this, but I'm not too surprised by the result. I don't know whether the developers would consider this a bug.

Why do you need a pure list/tuple rendering of this array?

I wonder if there's a function in numpy/lib/recfunctions.py that addresses this.
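
In the meantime, one possible alternative to walking the output element by element is to convert field by field and zip the columns back into records. This is only a sketch, not something taken from NumPy itself: `structured_to_list` is a hypothetical helper name, and it assumes a 1-D structured array whose fields are plain scalars or sub-arrays (no nested structured fields).

```python
import numpy as np

def structured_to_list(arr):
    """Return a list of tuples holding only native Python objects.

    Hypothetical helper (not part of NumPy). Assumes `arr` is 1-D and
    that its fields are scalars or sub-arrays, not nested structures.
    """
    names = arr.dtype.names
    if names is None:
        # Not a structured array: plain tolist() is already fully recursive
        return arr.tolist()
    if arr.ndim != 1:
        raise ValueError("this sketch only handles 1-D structured arrays")
    # Each field is an ordinary ndarray, so tolist() expands it completely
    columns = [arr[name].tolist() for name in names]
    # Re-assemble the per-field columns into one tuple per record
    return list(zip(*columns))

dt = np.dtype([('position', np.int32, 3), ('id', 'U3')])
a = np.array([([1, 2, 3], 'str')], dtype=dt)

result = structured_to_list(a)
print(result)   # [([1, 2, 3], 'str')]
```

The per-field tolist calls do the conversion inside NumPy, so the only Python-level work left is the zip over records. Whether that is actually faster than the recursive function in the question would need to be measured.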
