From what I understand, the recommended way to convert a NumPy array into a native Python list is to use ndarray.tolist.
Alas, this doesn't seem to work recursively when using structured arrays. Indeed, some ndarray objects are being referenced in the resulting list, unconverted:
>>> dtype = numpy.dtype([('position', numpy.int32, 3)])
>>> values = [([1, 2, 3],)]
>>> a = numpy.array(values, dtype=dtype)
>>> a.tolist()
[(array([1, 2, 3], dtype=int32),)]
I did write a simple function to work around this issue:
def array_to_list(array):
    if isinstance(array, numpy.ndarray):
        return array_to_list(array.tolist())
    elif isinstance(array, list):
        return [array_to_list(item) for item in array]
    elif isinstance(array, tuple):
        return tuple(array_to_list(item) for item in array)
    else:
        return array
Which, when used, provides the expected result:
>>> array_to_list(a) == values
True
The problem with this function is that it duplicates the job of ndarray.tolist by recreating each list/tuple that it outputs. Not optimal.
So the questions are:
- is this behaviour of ndarray.tolist to be expected?
- is there a better way to make this happen?
Just to generalize this a bit, I'll add another field to your dtype:
In [234]: dt = np.dtype([('position', np.int32, 3),('id','U3')])
In [235]: a=np.ones((3,),dtype=dt)
The repr display does use lists and tuples:
In [236]: a
Out[236]:
array([([1, 1, 1], '1'), ([1, 1, 1], '1'), ([1, 1, 1], '1')],
      dtype=[('position', '<i4', (3,)), ('id', '<U3')])
but as you note, tolist does not expand the elements:
In [237]: a.tolist()
Out[237]: [(array([1, 1, 1]), '1'), (array([1, 1, 1]), '1'), (array([1, 1, 1]), '1')]
Similarly, such an array can be created from the fully nested lists and tuples.
In [238]: a=np.array([([1,2,3],'str')],dtype=dt)
In [239]: a
Out[239]:
array([([1, 2, 3], 'str')],
      dtype=[('position', '<i4', (3,)), ('id', '<U3')])
In [240]: a.tolist()
Out[240]: [(array([1, 2, 3]), 'str')]
There's no problem recreating the array from this incomplete recursion:
In [250]: np.array(a.tolist(),dtype=dt)
Out[250]:
array([([1, 2, 3], 'str')],
      dtype=[('position', '<i4', (3,)), ('id', '<U3')])
This is the first time I've seen anyone use tolist with a structured array like this, but I'm not too surprised. I don't know if developers would consider this a bug or not.
Why do you need a pure list/tuple rendering of this array?
I wonder if there's a function in numpy/lib/recfunctions.py that addresses this.
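In the meantime, one way to avoid post-processing the output of tolist is to convert field by field: indexing a structured array by field name returns an ordinary array (the subarray dimensions are appended to its shape), and tolist expands that fully. The following is only a rough sketch, not an established NumPy recipe. The name structured_to_list is made up for illustration, and it assumes a 1-D structured array whose fields are scalars, plain subarrays, or nested structs:

```python
import numpy as np

def structured_to_list(arr):
    """Convert a 1-D structured array to nested native lists and tuples.

    Hypothetical helper (not part of NumPy). Assumes `arr` is 1-D and its
    fields are scalars, plain subarrays, or nested structured dtypes.
    """
    arr = np.asarray(arr)
    if arr.dtype.names is None:
        # Plain (non-structured) array: tolist already expands everything.
        return arr.tolist()
    # Convert each field as a whole column, recursing into nested structs.
    columns = [structured_to_list(arr[name]) for name in arr.dtype.names]
    # Re-assemble the columns into one tuple per record.
    return list(zip(*columns))

dt = np.dtype([('position', np.int32, 3), ('id', 'U3')])
a = np.array([([1, 2, 3], 'str')], dtype=dt)
print(structured_to_list(a))
```

This still builds the per-record tuples in Python, but each field is converted by a single tolist call rather than by walking every nested item.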