Pandas: groupby on a groupby result to get a list of lists

Published 2020-07-02 10:11

Question:

Given a dataframe structured like:

rule_id | ordering | sequence_id
   1    |    0     |     12     
   1    |    1     |     13
   1    |    1     |     14
   2    |    0     |     1
   2    |    1     |     2
   2    |    2     |     12 

I need to transform it into:

rule_id |  sequences
   1    |  [[12],[13,14]]
   2    |  [[1],[2],[12]]

That seems like an easy "groupby, then groupby again into a list" operation, but I cannot make it work in pandas.

df.groupby(['rule_id', 'ordering'])['sequence_id'].apply(list)

leaves me with

rule_id  ordering
1        0               [12]
         1            [13,14]
2        0                [1]
         1                [2]
         2               [12]

How does one apply another groupby operation to further combine these results into one list per rule_id?

Answer 1:

Use another groupby on the first level of the MultiIndex (rule_id):

df.groupby(['rule_id', 'ordering'])['sequence_id'].apply(list).groupby(level=0).apply(list)
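
For reference, here is a minimal end-to-end sketch of the same chain, rebuilding the sample dataframe from the question. The variable names `nested` and `result` are only illustrative, and the final `reset_index(name='sequences')` is an optional extra step (not part of the answer above) that assumes you want the two-column layout shown in the question:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'rule_id':     [1, 1, 1, 2, 2, 2],
    'ordering':    [0, 1, 1, 0, 1, 2],
    'sequence_id': [12, 13, 14, 1, 2, 12],
})

# Step 1: one list of sequence_id values per (rule_id, ordering) pair.
# Step 2: group those lists by the first index level (rule_id) and
#         collect them into an outer list.
nested = (
    df.groupby(['rule_id', 'ordering'])['sequence_id']
      .apply(list)
      .groupby(level=0)
      .apply(list)
)

# Optional (assumed) step: turn the Series back into a DataFrame
# with the column name used in the question.
result = nested.reset_index(name='sequences')

print(result)
```

With the sample data, `result['sequences']` holds `[[12], [13, 14]]` for rule_id 1 and `[[1], [2], [12]]` for rule_id 2. Because groupby sorts group keys by default, the inner lists come out in ascending `ordering`.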