I have a dataframe df1 that looks like this:

      rootID parentID jobID                        time
    0      A        A     B  2019-01-30 14:33:21.339469
    1      A        A     C  2019-01-30 14:33:21.812381
    2      A        C     D  2019-01-30 15:33:21.812381
    3      E        E     F  2019-01-30 15:33:21.812381
    4      E        F     G  2019-01-30 16:33:21.812381
    5      E        F     H  2019-01-30 17:33:21.812381
    6      E        G     I  2019-01-30 18:33:21.812381

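For anyone who wants to reproduce this, here is a minimal sketch that rebuilds df1 from the table above. It assumes the time column holds datetimes rather than strings.

```python
import pandas as pd

# Values copied row by row from the df1 table above.
df1 = pd.DataFrame({
    "rootID":   ["A", "A", "A", "E", "E", "E", "E"],
    "parentID": ["A", "A", "C", "E", "F", "F", "G"],
    "jobID":    ["B", "C", "D", "F", "G", "H", "I"],
    # Assumption: the time column is a datetime column, not plain strings.
    "time": pd.to_datetime([
        "2019-01-30 14:33:21.339469",
        "2019-01-30 14:33:21.812381",
        "2019-01-30 15:33:21.812381",
        "2019-01-30 15:33:21.812381",
        "2019-01-30 16:33:21.812381",
        "2019-01-30 17:33:21.812381",
        "2019-01-30 18:33:21.812381",
    ]),
})
```
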
I want to pivot this dataframe to the following form (df2):

      rootID subID1 subID2 subID3  #subFlows
    0      A      B                        1
    1      A      C      D                 2
    3      E      F      G      I          3
    4      E      F      H                 2

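To spell out the logic behind df2: each row is one chain from a rootID down to a job that has no children of its own, following the parentID -> jobID links. Below is a rough plain-Python sketch of that idea, with the edges typed in by hand from df1. The names edges, children, roots and paths_from are only for illustration. It is not a pandas solution, which is what I am after.

```python
from collections import defaultdict

# (parentID, jobID) pairs copied by hand from df1, in row order.
edges = [("A", "B"), ("A", "C"), ("C", "D"),
         ("E", "F"), ("F", "G"), ("F", "H"), ("G", "I")]

# Map each parent to the jobs it started directly.
children = defaultdict(list)
for parent, job in edges:
    children[parent].append(job)

def paths_from(node):
    """Yield every chain starting at node and ending at a job with no children."""
    if node not in children:
        yield [node]
        return
    for child in children[node]:
        for tail in paths_from(child):
            yield [node] + tail

roots = ["A", "E"]  # the distinct rootID values
rows = [path for root in roots for path in paths_from(root)]

# One chain per desired df2 row; #subFlows is the chain length minus the root.
sub_flows = [len(path) - 1 for path in rows]
```

Here rows holds one list per desired row of df2 (rootID first, then the subIDs), and sub_flows holds the matching #subFlows values.
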
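I have tried:

    # Number the jobs within each rootID, then spread them across columns.
    df2 = (df1.assign(g=df1.groupby('rootID').cumcount().add(1))
              .pivot(index='rootID', columns='g', values='jobID')
              .add_prefix('subID')
              .fillna("")
              .reset_index())
    # Count the non-empty cells per row, minus 1 for the rootID column.
    df2['#subFlows'] = (df2 != "").sum(axis=1).astype(int).sub(1)
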
which returns a dataframe like:

      rootID subID1 subID2 subID3
    0      A      B      C      D
    1      E      F      G      H

But I want, as described above, to separate the non-nested subIDs, so that each chain of nested jobs gets its own row.
Any idea how I would do this?