Python: flatten an array of arrays

Posted 2019-08-09 04:56

Question:

I have an array of arrays, something like this:

array([[array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668],
      dtype=int64)],
       [array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665],
      dtype=int64)],
       [array([46842, 26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285,
       17616, 25146, 32645, 41276], dtype=int64)],
       ...,
       [array([24534,  8230, 14267,  9352,  3543, 29397,   900, 32398, 34262,
       37646, 11930, 37173], dtype=int64)],
       [array([25157], dtype=int64)],
       [array([ 8859, 20850, 19322,  8075], dtype=int64)]], dtype=object)

What I want is:

array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668,
       33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665, 46842,
       26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285, 17616,
       25146, 32645, 41276,
       ...,
       24534,  8230, 14267,  9352,  3543, 29397,   900, 32398, 34262,
       37646, 11930, 37173, 25157,  8859, 20850, 19322,  8075], dtype=object)

I have searched for solutions, but all of them seem to be for np.array or list, and they do not work for this array.

>>> functools.reduce(operator.iconcat, orders2.values.tolist(), [])
[array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668],
       dtype=int64),
 array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665],
       dtype=int64),
 array([46842, 26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285,
        17616, 25146, 32645, 41276], dtype=int64),...
>>> orders2.values.flatten()
array([array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668],
      dtype=int64),
       array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665],
      dtype=int64),...

I couldn't even convert the array to a list:

>>> [sub.tolist() for sub in orders2.values]
[array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668],
       dtype=int64),
 array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665],
       dtype=int64),
 array([46842, 26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285,
        17616, 25146, 32645, 41276], dtype=int64),...

I find it hard to get information about the array class; everything I find is about list or np.array.

Answer 1:

Use a list comprehension, then convert back to array:

>>> from numpy import array
>>> arr = array([[array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668],
      dtype='int64')],
       [array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665],
      dtype='int64')],
       [array([46842, 26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285,
       17616, 25146, 32645, 41276], dtype='int64')],
       [array([24534,  8230, 14267,  9352,  3543, 29397,   900, 32398, 34262,
       37646, 11930, 37173], dtype='int64')],
       [array([25157], dtype='int64')],
       [array([ 8859, 20850, 19322,  8075], dtype='int64')]], dtype=object)
>>> array([x for i in arr.tolist() for x in i[0].tolist()])
array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668,
       33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665, 46842,
       26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285, 17616,
       25146, 32645, 41276, 24534,  8230, 14267,  9352,  3543, 29397,
         900, 32398, 34262, 37646, 11930, 37173, 25157,  8859, 20850,
       19322,  8075])
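
For anyone who wants to try the list-comprehension approach without pasting the whole array, here is a minimal sketch. It assumes the data is an (n, 1) object array whose single cell per row holds a 1-D integer array, as in the question; the sample values and the name `rows` are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: three ragged rows, shortened from the question.
rows = [[33120, 28985], [33754], [8859, 20850, 19322]]

# Build an (n, 1) object array whose cells are 1-D int64 arrays.
arr = np.empty((len(rows), 1), dtype=object)
for i, row in enumerate(rows):
    arr[i, 0] = np.array(row, dtype=np.int64)

# arr.tolist() yields [[inner_array], ...]; take the single cell of each
# row and expand it into plain Python ints, then rebuild one flat array.
flat = np.array([x for cell in arr.tolist() for x in cell[0].tolist()])

print(flat)
```

Note that this route goes through Python integers, so the result uses NumPy's default integer dtype rather than necessarily keeping the inner arrays' dtype.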


Answer 2:

In [141]: import numpy as np; array = np.array
     ...: arr = array([[array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668], 
     ...:       dtype='int64')], 
     ...:        [array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665], 
     ...:       dtype='int64')], 
     ...:        [array([46842, 26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285, 
     ...:        17616, 25146, 32645, 41276], dtype='int64')], 
     ...:        [array([24534,  8230, 14267,  9352,  3543, 29397,   900, 32398, 34262, 
     ...:        37646, 11930, 37173], dtype='int64')], 
     ...:        [array([25157], dtype='int64')], 
     ...:        [array([ 8859, 20850, 19322,  8075], dtype='int64')]], dtype=object)                           
In [142]: np.concatenate(arr.ravel())                                                                           
Out[142]: 
array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668,
       33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665, 46842,
       26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285, 17616,
       25146, 32645, 41276, 24534,  8230, 14267,  9352,  3543, 29397,
         900, 32398, 34262, 37646, 11930, 37173, 25157,  8859, 20850,
       19322,  8075])

The array is 2-D:

In [143]: arr.shape                                                                                             
Out[143]: (6, 1)

arr.ravel() flattens it to 1-D with shape (6,), and np.concatenate then joins that list (or other sequence) of arrays into a single array.
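
The two steps can be reproduced on a small stand-in array. This is only a sketch: the sample values and the name `rows` are invented, and it assumes the same layout as in the question (an (n, 1) object array holding 1-D integer arrays). If the data comes from a DataFrame, `orders2.values` would presumably take the place of `arr`.

```python
import numpy as np

# Hypothetical sample data standing in for the array in the question.
rows = [[33120, 28985, 9327], [33754, 24838], [25157]]

# Build an (n, 1) object array whose cells are 1-D int64 arrays.
arr = np.empty((len(rows), 1), dtype=object)
for i, row in enumerate(rows):
    arr[i, 0] = np.array(row, dtype=np.int64)

print(arr.shape)          # (3, 1)
print(arr.ravel().shape)  # (3,)

# ravel() only removes the outer nesting; concatenate joins the inner arrays.
flat = np.concatenate(arr.ravel())
print(flat)
```

Unlike the list-comprehension route, this never leaves NumPy, so the result keeps the int64 dtype of the inner arrays.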