Slice MultiIndex pandas DataFrame by position

2019-05-13 00:44发布

问题:

I am currently trying to to slice a MuliIndex DataFrame that has three levels by position. I am using pandas 19.1

Level0  Level1  Level2      Value
03-00368    A   Item111     6.9
03-00368    A   Item333     19.2
03-00368    B   Item111     9.7
03-00368    B   Item222     17.4
04-00176    C   Item110     17.4
04-00176    C   Item111     9.7
04-00246    D   Item46      12.5
04-00246    D   Item66      5.6
04-00246    D   Item99      11.2
04-00247    E   Item23      12.5
04-00247    E   Item24      5.6
04-00247    E   Item111     11.2
04-00247    F   Item23      7.9
04-00247    F   Item24      9.7
04-00247    F   Item111     12.5
04-00247    G   Item46      11.2
04-00247    G   Item66      9.7
04-00247    G   Item999     9.7
04-00247    H   Item23      11.2
04-00247    H   Item94      7.9
04-00247    H   Item111     11.2
04-00247    I   Item46      5.6
04-00247    I   Item66      12.5
04-00247    I   Item888     11.2
04-00353    J   Item66      12.5
04-00353    J   Item99      12.5
04-00354    K   Item43      12.5
04-00354    K   Item94      12.5
04-00355    L   Item54      50
04-00355    L   Item99      50

Currently I can achieve:

df.loc[(slice('03-00368', '04-00361'), slice(None), slice(None)), :]

But in practice I won't know what the labels will be. I just want to select the first ten level 0's so I tried this(and many other things which are similar):

>>> df.iloc[(slice(0, 10), slice(None), slice(None)), :]
TypeError: unorderable types: int() >= NoneType()

The end goal is to limit the final number of rows displayed, without breaking up the Level0 index

>>>df.iloc[(0,1,), :]
Level0   Level1 Level2      Value
03-00368    A   Item111     6.9
03-00368    A   Item333     19.2

Notice that it only returned the first two rows, I would like the result to be:

Level0  Level1  Level2      Value
03-00368    A   Item111     6.9
03-00368    A   Item333     19.2
03-00368    B   Item111     9.7
03-00368    B   Item222     17.4
04-00176    C   Item110     17.4
04-00176    C   Item111     9.7

There are of hacky way to accomplish this but I'm posting because I want to know what I am doing wrong, or why I can't expect to be able to slice MultiIndexes this way.

回答1:

method 1
groupby + head

df.groupby(level=0).head(10)

method 2
Unnecessarily verbose
IndexSlice

df.sort_index().loc[pd.IndexSlice[df.index.levels[0][:10], :, :], :]

method 3
loc

df.loc[df.index.levels[0][:10].tolist()]


回答2:

You could groupby level and take the top two this way

df.groupby(level=0).head(2)