Pandas MultiIndex DataFrame.rolling offset

2019-06-24 03:07发布

问题:

Why can't I use an offset when rolling a multi-index DataFrame? For example, with:

rng = pd.date_range('2017-01-03', periods=20, freq='8D')
i = pd.MultiIndex.from_product([['A','B','C'], rng], names=['Name','Date'])
df = pd.DataFrame(np.random.randn(60), i, columns=['Vals'])

If I try grouping and rolling with an offset I get "ValueError: window must be an integer":

df['Avg'] = df.groupby(['Name'])['Vals'].rolling('30D').mean() # << Why doesn't this work?

Not that these following variants meet my needs, but note that grouping and rolling with an int works:

df['Avg'] = df.groupby(['Name'])['Vals'].rolling(4).mean()

And I can roll with an offset on a single-index subset of the DataFrame:

d = df.loc['A']
d['Avg'] = d['Vals'].rolling('30D').mean()

If it's truly impossible to do rolling with offsets on multi-index DataFrames, what would be the most efficient workaround to apply one to each level-0 index item?

回答1:

In order to use an offset like '30D' you need a simple date index. In this case the simplest way to achieve that is to move 'Name' out of the index with reset_index(level='Name'), leaving you with only 'Date' as the index:

df['Avg'] = df.reset_index(level='Name').groupby(['Name'])['Vals'].rolling('30D').mean()