可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have the following multi-level data frame:

Year   2016                    2017                 
Quarter  3   4                 1                 2      
Month  Sep   Oct   Nov   Dec   Jan  Feb    Mar   Apr   May   Jun
A      0.16  0.95  0.92  0.45  0.30  0.35  0.95  0.88  0.18  0.10
B      0.88  0.67  0.07  0.70  0.74  0.33  0.77  0.21  0.81  0.85
C      0.79  0.56  0.13  0.19  0.94  0.23  0.72  0.62  0.66  0.93

I want to sum up over the quarters, so that the final result is as follows:

Year     2016        2017   
Quarter  3     4     1     2
A        0.16  2.32  1.60  1.16
B        0.88  1.44  1.85  1.86
C        0.79  0.89  1.89  2.21

I tried with the following formula:

df= df.groupby('Quarter').transform('sum')

but I get this error:

KeyError: 'Quarter'

Clearly that's the wrong way to approach it. Could anyone please a solution or to approach finding one.

Additional information

The output of the df.index command is: Index([u'A', u'B',u'C'],dtype='object', name=u'DF name')

Thanks!

回答1:

Just using sum

df.sum(level=[0,1],axis=1)
Out[14]: 
year    2016        2017      
quater     3     4     1     2
A       0.16  2.32  1.60  1.16
B       0.88  1.44  1.84  1.87
C       0.79  0.88  1.89  2.21

回答2:

When you use groupby in pandas you group data based on columns data. But you have your groups in rows. All you need is to transpose your df before grouping and after.

Here is code you need:

First let's create df like yours:

import pandas as pd

index = pd.MultiIndex.from_tuples([(2016, 3, 'Sep', 'A'),
                                   (2016, 3, 'Sep', 'B'),
                                   (2016, 3, 'Sep', 'C'),
                                   (2016, 4, 'Oct', 'A'),
                                   (2016, 4, 'Oct', 'B'),
                                   (2016, 4, 'Oct', 'C'),
                                   (2016, 4, 'Nov', 'A'),
                                   (2016, 4, 'Nov', 'B'),
                                   (2016, 4, 'Nov', 'C'),
                                   (2017, 1, 'Jan', 'A'),
                                   (2017, 1, 'Jan', 'B'),
                                   (2017, 1, 'Jan', 'C'),
                                   (2017, 1, 'Feb', 'A'),
                                   (2017, 1, 'Feb', 'B'),
                                   (2017, 1, 'Feb', 'C'),
                                   ], names=['Year', 'Quarter', 'Month', 'Group'])

raw_df = pd.Series(range(15), index=index)

df = raw_df.unstack([0,1,2])
print(df)

Output:

Year    2016         2017    
Quarter    3   4        1    
Month    Sep Oct Nov  Jan Feb
Group                        
A          0   3   6    9  12
B          1   4   7   10  13
C          2   5   8   11  14

Looks exactly like in your example. Now you need only 1 line of code:

new_df = df.transpose().groupby(['Year', 'Quarter']).sum().transpose()
print(new_df)

Here's your output:

Year    2016     2017
Quarter    3   4    1
Group                
A          0   9   21
B          1  11   23
C          2  13   25

Good luck!

回答3:

df.sum(level=['Year', 'Quater'], axis=1)

EDIT: thanks Matt Messersmith for note on transpose

Reproducing full example:

import pandas as pd

tuples = [(2016, 3, 'Sep'), (2016, 4, 'Oct'), (2016, 4, 'Nov'),
          (2016, 4, 'Dec'), (2017, 1, 'Jan'), (2017, 1, 'Feb'),
          (2017, 1, 'Mar'), (2017, 2, 'Apr'), (2017, 2, 'May'),
          (2017, 2, 'Jun')]

index = pd.MultiIndex.from_tuples(tuples, names=['Year', 'Quater', 'Month'])

df = pd.DataFrame([(0.16, 0.88, 0.79), (0.95, 0.67, 0.56), (0.92, 0.07, 0.13), 
                   (0.45, 0.70, 0.19), (0.30, 0.74, 0.94), (0.35, 0.33, 0.23), 
                   (0.95, 0.77, 0.72), (0.88, 0.21, 0.62), (0.18, 0.81, 0.66), 
                   (0.10, 0.85, 0.93)], 
                  index=index, 
                  columns=["A", "B", "C"])
df = df.T
print(df.sum(level=['Year', 'Quater'], axis=1))

回答4:

You can try transpose the data then sum it by the index you want and transpose it back the way you want

df.transpose().sum(level=[0,1]).transpose()

In case you needed to pivot your results.

How to group a python data frame by multilevel row

问题:

回答1:

回答2:

回答3:

回答4:

收藏的人(0)

How to group a python data frame by multilevel row

问题:

回答1:

回答2:

回答3:

回答4:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮