pandas dataframe groupby datetime month

Consider a csv file:

string,date,number
a string,2/5/11 9:16am,1.0
a string,3/5/11 10:44pm,2.0
a string,4/22/11 12:07pm,3.0
a string,4/22/11 12:10pm,4.0
a string,4/29/11 11:59am,1.0
a string,5/2/11 1:41pm,2.0
a string,5/2/11 2:02pm,3.0
a string,5/2/11 2:56pm,4.0
a string,5/2/11 3:00pm,5.0
a string,5/2/14 3:02pm,6.0
a string,5/2/14 3:18pm,7.0

I can read this in, and reformat the date column into datetime format:

b=pd.read_csv('b.dat')
b['date']=pd.to_datetime(b['date'],format='%m/%d/%y %I:%M%p')

I have been trying to group the data by month. It seems like there should be an obvious way of accessing the month and grouping by that. But I can't seem to do it. Does anyone know how?

What I am currently trying is re-indexing by the date:

b.index=b['date']

I can access the month like so:

b.index.month

However I can't seem to find a function to lump together by month.

标签： python pandas datetime pandas-groupby

4条回答

疯言疯语

2楼-- · 2019-01-08 08:54

(update: 2018)

Note that pd.Timegrouper is depreciated and will be removed. Use instead:

 df.groupby(pd.Grouper(freq='M'))

0人赞添加讨论(0) 举报

我只想做你的唯一

3楼-- · 2019-01-08 08:55

Slightly alternative solution to @jpp's but outputting a YearMonth string:

df['YearMonth'] = pd.to_datetime(df['Date']).apply(lambda x: '{year}-{month}'.format(year=x.year, month=x.month))

res = df.groupby('YearMonth')['Values'].sum()

0人赞添加讨论(0) 举报

聊天终结者

4楼-- · 2019-01-08 08:59

Managed to do it:

df.groupby(by=[b.index.month, b.index.year])

df.groupby(pd.Grouper(freq='M'))  # update for v0.21+

0人赞添加讨论(0) 举报

▲ chillily

5楼-- · 2019-01-08 09:04

One solution which avoids MultiIndex is to create a new datetime column setting day = 1. Then group by this column. Trivial example below.

df = pd.DataFrame({'Date': pd.to_datetime(['2017-10-05', '2017-10-20']),
                   'Values': [5, 10]})

# normalize day to beginning of month
df['YearMonth'] = df['Date'] + pd.offsets.MonthBegin(1)

# two alternative methods
df['YearMonth'] = df['Date'] - pd.to_timedelta(df['Date'].dt.day-1, unit='D')
df['YearMonth'] = df['Date'].map(lambda dt: dt.replace(day=1))

g = df.groupby('YearMonth')

res = g['Values'].sum()

# YearMonth
# 2017-10-01    15
# Name: Values, dtype: int64

The subtle benefit of this solution is, unlike pd.Grouper, the grouper index is normalized to the beginning of each month rather than the end, and therefore you can easily extract groups via get_group:

some_group = g.get_group('2017-10-01')

Calculating the last day of October is slightly more cumbersome. pd.Grouper, as of v0.23, does support a convention parameter, but this is only applicable for a PeriodIndex grouper.

0人赞添加讨论(0) 举报

pandas dataframe groupby datetime month

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间