Pandas reindex and fill missing values: “Index mus

Posted 2020-02-11 08:08

Question:

While answering this Stack Overflow question, I found some interesting behavior when using a fill method while reindexing a dataframe.

This old bug report in pandas says that df.reindex(newIndex, method='ffill') should be equivalent to df.reindex(newIndex).ffill(), but that is NOT the behavior I'm seeing.

Here's a code snippet that illustrates the behavior:

import pandas as pd

# The frame's own index is deliberately out of order.
df = pd.DataFrame({'values': 2}, index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']))
newIndex = pd.DatetimeIndex(['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05'])
print(df.reindex(newIndex).ffill())  # works
print(df.reindex(newIndex, method='ffill'))  # raises ValueError

The first print statement works as expected. The second raises a

ValueError: index must be monotonic increasing or decreasing

What's going on here?


EDIT: Note that the sample df intentionally has a non-monotonic index. The question pertains to the order of operations in df.reindex(newIndex, method='ffill'). My expectation, as the bug report says, is that it should first reindex with the new index and then fill.

As you can see, the newIndex.is_monotonic is True, and the fill works when called separately but fails when called as a parameter to reindex.
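A minimal sketch of where the check seems to apply, assuming a recent pandas release: the monotonicity requirement is on the index of the frame being reindexed, not on the target index. The names new_index, target_is_sorted, source_is_sorted and error_message are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    {"values": 2},
    index=pd.DatetimeIndex(["2016-06-02", "2016-05-04", "2016-06-03"]),
)
new_index = pd.DatetimeIndex(
    ["2016-05-04", "2016-06-01", "2016-06-02", "2016-06-03", "2016-06-05"]
)

# The target index is sorted...
target_is_sorted = new_index.is_monotonic_increasing
# ...but the frame's own index is neither increasing nor decreasing.
source_is_sorted = (
    df.index.is_monotonic_increasing or df.index.is_monotonic_decreasing
)

# With method='ffill', each new label is matched against the existing
# labels, and that lookup needs the existing index to be ordered.
try:
    df.reindex(new_index, method="ffill")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(target_is_sorted, source_is_sorted, error_message)
```

The pandas documentation for reindex notes that method is only applicable to objects with a monotonically increasing or decreasing index, which is consistent with the error above.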

Answer 1:

Some element of reindex requires the incoming index to be sorted. I'm deducing that when method is passed, it fails to presort the incoming index and subsequently fails. I'm drawing this conclusion based on the fact that this works:

print(df.sort_index().reindex(newIndex.sort_values(), method='ffill'))
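As a self-contained sketch of the workaround, the snippet below sorts the frame first and then compares the one-step and two-step fills on the question's data. The variable names are illustrative. For this data the filled values agree, though the two approaches are not equivalent in general: reindex with method fills by label lookup against the original index, while a separate .ffill() fills any NaN by position.

```python
import pandas as pd

df = pd.DataFrame(
    {"values": 2},
    index=pd.DatetimeIndex(["2016-06-02", "2016-05-04", "2016-06-03"]),
)
new_index = pd.DatetimeIndex(
    ["2016-05-04", "2016-06-01", "2016-06-02", "2016-06-03", "2016-06-05"]
)

# Sorting the frame's own index first lets the label-based fill run.
filled_in_one_step = df.sort_index().reindex(new_index, method="ffill")

# Reindexing without a method works on the unsorted frame; the gaps are
# then filled positionally in a second step.
filled_in_two_steps = df.reindex(new_index).ffill()

print(filled_in_one_step)
print(filled_in_two_steps)
```

In this example new_index is already sorted, so only the frame itself needs sort_index().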


Answer 2:

It seems that the same sorting needs to be done on the columns as well.

In[76]: frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'],columns=['Ohio', 'Texas', 'California'])

In[77]: frame.reindex(index=['a','b','c','d'],method='ffill',columns=states)
---> ValueError: index must be monotonic increasing or decreasing

In[78]: frame.reindex(index=['a','b','c','d'],method='ffill',columns=states.sort())

Out[78]:
   Ohio  Texas  California
a     0      1           2
b     0      1           2
c     3      4           5
d     6      7           8
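The answer never defines states, so the sketch below uses a hypothetical list of column labels. It avoids the error by applying the fill method to the rows only and reindexing the columns in a separate call with no method, so the unsorted column labels are never used for a fill lookup. Note also that if states is a plain Python list, states.sort() sorts in place and returns None, so In[78] may simply have passed columns=None and left the columns untouched, which would match the output shown.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(9).reshape((3, 3)),
    index=["a", "c", "d"],
    columns=["Ohio", "Texas", "California"],
)

# Hypothetical column list; the original post does not show its contents.
states = ["Texas", "Utah", "California"]

# Step 1: forward-fill along the rows, whose index is already sorted.
rows_filled = frame.reindex(index=["a", "b", "c", "d"], method="ffill")

# Step 2: pick and add columns without any fill method.
result = rows_filled.reindex(columns=states)

print(result)
```

Here the new row b takes its values from row a, and the new Utah column is left as NaN because no fill is applied across columns.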