I'm trying to remove the innermost nesting in a list of lists of single-element lists. Do you know a relatively easy way (converting to NumPy arrays is fine) to get from:
[[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]
to this?:
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
Also, the real lists I'm working with contain datetime objects rather than the ints in the example, and the sublists will be of varying lengths.
Alternatively, it would be fine to pad the original lists with nans so that every sublist has the same length, as long as the nans aren't present in the output list, i.e.
[[[1], [2], [3], [4], [5]],
[[6], [7], [8], [nan], [nan]],
[[11], [12], [nan], [nan], [nan]]]
to this:
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
If the nesting is always consistent, then this is trivial:
In [2]: import itertools
In [3]: nested = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
In [4]: unested = [list(itertools.chain(*sub)) for sub in nested]
In [5]: unested
Out[5]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
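Since the question mentions that the real data holds datetime objects, here is a minimal sketch of the same idea applied to them. It uses itertools.chain.from_iterable, which avoids unpacking each sublist into arguments; the sample dates and variable names are made up for illustration.

```python
import itertools
from datetime import datetime

# Hypothetical sample data: ragged sublists of single-element lists.
nested_dates = [
    [[datetime(2016, 1, 1)], [datetime(2016, 1, 2)], [datetime(2016, 1, 3)]],
    [[datetime(2016, 2, 1)]],
    [[datetime(2016, 3, 1)], [datetime(2016, 3, 2)]],
]

# chain.from_iterable concatenates the inner lists of each sublist.
flat_dates = [list(itertools.chain.from_iterable(sub)) for sub in nested_dates]

print([len(sub) for sub in flat_dates])
```

Nothing here depends on the element type, so the same comprehension should work for any objects stored in the innermost lists.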
Note, the solutions that leverage add with lists are going to give you O(n^2) performance, where n is the number of sub-sublists being merged within each sublist.
>>> from functools import reduce  # reduce is not a builtin in Python 3
>>> from operator import add
>>> lists = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
>>> [reduce(add, lst) for lst in lists]
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
This is not very efficient, as it builds a new list each time add is called.
Alternatively, you can use sum or a simple list comprehension, as seen in the other answers.
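For reference, the sum variant might look like the sketch below. Passing an empty list as the start value makes sum concatenate lists; like reduce with add, it builds a new list at each step, so it shares the same quadratic behaviour.

```python
lists = [[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]

# sum() with a list as the start value concatenates the inner lists one by one.
flattened = [sum(lst, []) for lst in lists]

print(flattened)
```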
How about np.squeeze?
Remove single-dimensional entries from the shape of an array.
>>> import numpy as np
>>> arr = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
>>> arr
[[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]
>>> [np.squeeze(i) for i in arr]
[array([1, 2, 3, 4, 5]), array([6, 7, 8]), array([11, 12])]
Note that squeeze removes every length-1 dimension, wherever it occurs, not only the innermost one. But your question specifies a "list of lists".
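One caveat worth illustrating: because np.squeeze drops every length-1 axis, a sublist that holds a single element collapses to a 0-d array. The sketch below shows that, and shows restricting the squeeze to the last axis with the axis argument; the sample data is made up.

```python
import numpy as np

# Hypothetical data where the last sublist has only one element.
arr = [[[1], [2], [3]], [[11]]]

# Plain squeeze drops all length-1 axes, so [[11]] becomes a 0-d array.
squeezed_all = [np.squeeze(sub) for sub in arr]

# Restricting the squeeze to the last axis keeps each result one-dimensional.
squeezed_last = [np.squeeze(np.asarray(sub), axis=-1) for sub in arr]

print([a.shape for a in squeezed_all])
print([a.tolist() for a in squeezed_last])
```

Calling tolist() on each result converts it back to a plain Python list if arrays are not wanted.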
Since in your case each innermost list has just one element, you can access the value by index instead of using an additional function. For example:
>>> [[y[0] for y in x] for x in my_list]
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
If your innermost lists may have more than one element, you can do:
>>> [[z for y in x for z in y] for x in my_list]
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
Because this question looks fun!
I used a recursive function that unpacks a list if it only has one value.
def make_singular(l):
    try:
        if len(l) == 1:
            # unpack a single-element list
            return l[0]
        else:
            return [make_singular(l_) for l_ in l]
    except TypeError:
        # l has no len(), so it is already a plain value
        return l
nest = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
make_singular(nest)
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
If you know the level of nesting then one of the list comprehensions is easy.
In [129]: ll=[ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
In [130]: [[j[0] for j in i] for i in ll] # simplest
Out[130]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
If the criterion is just to remove an inner layer of nesting, regardless of how deep it occurs, the code will require more thought. I'd probably try to write it as a recursive function.
The np.nan (or None) padding doesn't help with the list version:
In [131]: lln=[ [ [1],[2],[3],[4],[5] ], [ [6],[7],[8],[nan],[nan]] , [ [11],[12],[nan],[nan],[nan] ] ]
In [132]: [[j[0] for j in i if j[0] is not np.nan] for i in lln]
Out[132]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
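Note that `is not np.nan` is an identity test, so it only filters out values that are the very same object as np.nan. A somewhat more defensive sketch follows, using a hypothetical helper is_nan that also leaves non-float values (such as datetime objects) alone:

```python
import math

def is_nan(value):
    """Return True only for float NaN values (hypothetical helper)."""
    return isinstance(value, float) and math.isnan(value)

# float("nan") creates a NaN that need not be the same object as np.nan.
padded = [
    [[1], [2], [3]],
    [[6], [float("nan")], [float("nan")]],
]

cleaned = [[j[0] for j in i if not is_nan(j[0])] for i in padded]
print(cleaned)
```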
The padding does let us make a 3d array, which can then easily be squeezed:
In [135]: arr = np.array(lln)
In [136]: arr.shape
Out[136]: (3, 5, 1)
In [137]: arr = arr[:,:,0]
In [138]: arr
Out[138]:
array([[ 1., 2., 3., 4., 5.],
[ 6., 7., 8., nan, nan],
[ 11., 12., nan, nan, nan]])
but then there's a question of how to remove those nan and create ragged sublists.
Masked arrays might let you work with a 2d array without being bothered by these nan:
In [141]: M = np.ma.masked_invalid(arr)
In [142]: M
Out[142]:
masked_array(data =
[[1.0 2.0 3.0 4.0 5.0]
[6.0 7.0 8.0 -- --]
[11.0 12.0 -- -- --]],
mask =
[[False False False False False]
[False False False True True]
[False False True True True]],
fill_value = 1e+20)
In [144]: M.sum(axis=1) # e.g. sublist sums
Out[144]:
masked_array(data = [15.0 21.0 23.0],
mask = [False False False],
fill_value = 1e+20)
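If the ragged lists are still wanted at the end, each row of the masked array can be compressed, which drops the masked entries. A minimal sketch, rebuilding a small padded array so that it runs on its own:

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0],
                [6.0, np.nan, np.nan]])

# Mask the NaN entries, then keep only the unmasked values of each row.
M = np.ma.masked_invalid(arr)
ragged = [row.compressed().tolist() for row in M]

print(ragged)
```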
Removing the nan from arr is probably easiest with a list comprehension. The values are float because np.nan is float.
In [153]: [[i for i in row if ~np.isnan(i)] for row in arr]
Out[153]: [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0], [11.0, 12.0]]
So the padding doesn't help.
If the padding was with None, then the array would be object dtype, which is closer to a nested list in character.
In [163]: lln
Out[163]:
[[[1], [2], [3], [4], [5]],
[[6], [7], [8], [None], [None]],
[[11], [12], [None], [None], [None]]]
In [164]: arr=np.array(lln)[:,:,0]
In [165]: arr
Out[165]:
array([[1, 2, 3, 4, 5],
[6, 7, 8, None, None],
[11, 12, None, None, None]], dtype=object)
In [166]: [[i for i in row if i is not None] for row in arr]
Out[166]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
Another array approach is to count the number of valid elements at the 2nd level, flatten the whole thing, and then split.
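The count, flatten, and split idea can be sketched as follows for the unpadded input. This is one possible reading of the approach, not necessarily what was intended, and the variable names are illustrative.

```python
import numpy as np

nested = [[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]

# Count the elements in each second-level list.
lengths = [len(sub) for sub in nested]

# Stack the (n, 1) sublists into one (total, 1) array and drop the last axis.
flat = np.concatenate([np.asarray(sub) for sub in nested])[:, 0]

# Split the flat array back at the cumulative boundaries.
pieces = np.split(flat, np.cumsum(lengths)[:-1])
result = [p.tolist() for p in pieces]

print(result)
```

np.split takes the indices at which to cut, so the last cumulative sum (the total length) is left out.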
A recursive function:
def foo(alist):
    if len(alist)==1:
        return alist[0]
    else:
        return [foo(i) for i in alist if foo(i) is not None]
In [200]: ll=[ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [11], [[[12],[13]]]]
In [201]: foo(ll)
Out[201]: [[1, 2, 3, 4, 5], [6, 7, 8], 11, [[12], [13]]]
In [202]: lln=[ [ [1],[2],[3],[4],[5] ], [ [6],[7],[8],[None],[None]] , [ [11],[12],[None],[None],[None] ] ]
In [203]: foo(lln)
Out[203]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
It recurses down to the level where lists have length 1. It is still fragile, and misbehaves if the nesting levels vary. Conceptually it's quite similar to @piRSquared's answer.
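For varying nesting levels, one possible way to make the recursion less fragile is to unwrap a level only when every element of it is a single-element list holding a non-list value. The function below, strip_innermost, is a hypothetical sketch under that assumption and is not taken from any of the answers; it does not filter out None or nan padding.

```python
def strip_innermost(obj):
    """Remove the innermost layer of single-element lists (hypothetical sketch)."""
    if not isinstance(obj, list):
        return obj
    # Unwrap only when every element is a one-item list holding a non-list value.
    if all(isinstance(x, list) and len(x) == 1 and not isinstance(x[0], list)
           for x in obj):
        return [x[0] for x in obj]
    # Otherwise keep this level and look one level deeper.
    return [strip_innermost(x) for x in obj]

nest = [[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]
print(strip_innermost(nest))

# Mixed depths: only the deepest single-element wrappers are removed.
mixed = [[[1], [2]], [[[3]], [[4]]]]
print(strip_innermost(mixed))
```

Unlike the length-based checks above, an outer list that happens to contain a single sublist is left in place, since the test looks at the elements rather than at the length of the list itself.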