I have a list of lists. Each sublist has a length that varies between 1 and 100. Each sublist contains a particle ID at different times in a set of data. I would like to form lists of all particle IDs at a given time. To do this I could use something like:
list = [[1,2,3,4,5],[2,6,7,8],[1,3,6,7,8]]
list2 = [item[0] for item in list]
list2 would contain the first elements of each sublist in list. I would like to do this operation not just for the first element, but for every element between 1 and 100. My problem is that element number 100 (or 66 or 77 or whatever) does not exist for every sublist.
Is there some way of creating a list of lists, where each sublist is the list of all particle IDs at a given time?
I have thought about trying to use numpy arrays to solve this problem, as if the lists were all the same length this would be trivial. I have tried adding -1's to the end of each list to make them all the same length, and then masking the negative numbers, but this hasn't worked for me so far. I will use the list of IDs at a given time to slice another separate array:
pos = pos[satIDs]
You could append numpy.nan to your short lists and afterwards create a numpy array. Afterwards you can use numpy slicing as usual.
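A minimal sketch of the NaN-padding idea, assuming the three sample sublists from the question. The names data, padded and ids_at_t are made up for illustration. Note that padding with NaN turns the array into floats, so the IDs are cast back to int before they are used as indices.

```python
import numpy as np

data = [[1, 2, 3, 4, 5], [2, 6, 7, 8], [1, 3, 6, 7, 8]]

# Pad every sublist with NaN up to the longest length (this forces a float array).
width = max(len(sub) for sub in data)
padded = np.array([sub + [np.nan] * (width - len(sub)) for sub in data])

# Column t holds the IDs at time t; drop the NaN padding before using them.
t = 4
column = padded[:, t]
ids_at_t = column[~np.isnan(column)].astype(int)
print(ids_at_t)
```

For the sample data only the first and third sublists reach index 4, so ids_at_t holds two IDs and can be used directly in an expression such as pos[ids_at_t].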
--update--
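A short sketch of the itertools.zip_longest idea this update describes, assuming the sample sublists from the question. The names data, zipped and ids_per_time are illustrative only.

```python
from itertools import zip_longest

data = [[1, 2, 3, 4, 5], [2, 6, 7, 8], [1, 3, 6, 7, 8]]

# zip_longest pads exhausted sublists with None (the default fillvalue).
zipped = list(zip_longest(*data))

# Drop the None padding so each entry holds only the IDs present at that time.
ids_per_time = [[x for x in column if x is not None] for column in zipped]
print(ids_per_time)
```

The filter tests "is not None" rather than truthiness, so a particle ID of 0 would be kept.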
You could use itertools.zip_longest. This will zip the lists together and insert None when one of the lists is exhausted. If you don't want the None elements, you can filter them out.
Approach #1
One almost* vectorized approach is to create an ID for each element based on its position in the new order, sort by that ID, and split, like so -
*A list comprehension is involved at the start, but since it only collects the lengths of the input sublists, its effect on the total runtime should be minimal.
Sample run -
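One possible way to realise the ID-and-split idea from Approach #1 is sketched below. This is an assumption about how it could be done, not the answer's own listing. The function name ids_per_time and the variable names are hypothetical.

```python
import numpy as np

def ids_per_time(sublists):
    # The only Python-level loop: collect the length of each sublist.
    lens = np.array([len(sub) for sub in sublists])
    flat = np.concatenate(sublists)

    # Time index of every element, i.e. its position inside its own sublist.
    starts = np.cumsum(lens) - lens
    time_idx = np.arange(lens.sum()) - np.repeat(starts, lens)

    # A stable sort groups elements by time index and keeps sublist order.
    order = np.argsort(time_idx, kind="stable")
    counts = np.bincount(time_idx)
    return np.split(flat[order], np.cumsum(counts)[:-1])

data = [[1, 2, 3, 4, 5], [2, 6, 7, 8], [1, 3, 6, 7, 8]]
result = ids_per_time(data)
for t, ids in enumerate(result):
    print(t, ids)
```

Each array in result holds the particle IDs present at one time index, in the order of the original sublists.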
Approach #2
Here's another approach that creates a 2D array, which is easier to index and trace back to the original input elements. This uses NumPy broadcasting along with boolean indexing. The implementation would look something like this -
Sample run -
So each column of the output corresponds to the IDs at one time step.
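A hedged sketch of the broadcasting and boolean-indexing idea from Approach #2, assuming that particle IDs are non-negative so that -1 can serve as a fill value. The function name to_padded_2d and the fill parameter are made up for this example.

```python
import numpy as np

def to_padded_2d(sublists, fill=-1):
    lens = np.array([len(sub) for sub in sublists])

    # Broadcasting: mask[i, t] is True where sublist i has an element at time t.
    mask = np.arange(lens.max()) < lens[:, None]

    # Boolean indexing fills the valid slots row by row with the flattened data.
    out = np.full(mask.shape, fill, dtype=int)
    out[mask] = np.concatenate(sublists)
    return out, mask

data = [[1, 2, 3, 4, 5], [2, 6, 7, 8], [1, 3, 6, 7, 8]]
out, mask = to_padded_2d(data)

# IDs at time t: take column t and keep only the valid rows.
t = 4
ids_at_t = out[mask[:, t], t]
print(out)
print(ids_at_t)
```

Row i of out is sublist i, so an entry can be traced back to its source sublist by its row index, and the mask says which entries are real data rather than padding.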
If you want it with a one-line for loop and in an array you can do this:
And if you want to know which id is from which list you can do this:
list2 would be like this:
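A guess at what such one-liners might look like, written with plain nested list comprehensions and the sample data from the question. The names per_time and tagged are hypothetical, and the results are kept as lists because the rows have different lengths.

```python
data = [[1, 2, 3, 4, 5], [2, 6, 7, 8], [1, 3, 6, 7, 8]]
n_times = max(len(sub) for sub in data)

# One line: the IDs present at each time index t.
per_time = [[sub[t] for sub in data if len(sub) > t] for t in range(n_times)]

# Same idea, but each ID is paired with the index of the sublist it came from.
tagged = [[(i, sub[t]) for i, sub in enumerate(data) if len(sub) > t]
          for t in range(n_times)]

print(per_time)
print(tagged[4])
```

The len(sub) > t check skips sublists that have already ended, so no padding value is needed.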