Slicing and assigning values multi-indexed pandas

I want to select and change the value of a dataframe cell. There are 2 indices used for this dataframe: 'datetime' and 'idx'. Both contain labels which are unique and sequential. 'datetime' index has datetime label of datetime type, and 'idx' has integer valued labels.

import numpy as np
import pandas as pd

dt = pd.date_range("2010-10-01 00:00:00", periods=5, freq='H')
d = {'datetime': dt, 'a': np.arange(len(dt))-1,'b':np.arange(len(dt))+1}
df = pd.DataFrame(data=d)
df.set_index(keys='datetime',inplace=True,drop=True)
df.sort_index(axis=0,level='datetime',ascending=False,inplace=True)

df.loc[:,'idx'] = np.arange(0, len(df),1)+5
df.set_index('idx',drop=True,inplace=True,append=True)
print(df)

'Here is the dataframe:

                         a  b
datetime            idx      
2010-10-01 04:00:00 5    3  5
2010-10-01 03:00:00 6    2  4
2010-10-01 02:00:00 7    1  3
2010-10-01 01:00:00 8    0  2
2010-10-01 00:00:00 9   -1  1

'Say I want to get the row where idx=5. How do I do that? I could use this:

print(df.iloc[0])

Then I will get result below:

a    3
b    5
Name: (2010-10-01 04:00:00, 5), dtype: int32

But I want to access and set the value in this cell where idx=5, column='a', by specifying idx value, and column name 'a'. How do I do that?

Please advice.

标签： python pandas slice multi-index setvalue

3条回答

爷、活的狠高调

2楼-- · 2019-07-20 16:39

One more way to do it.

Select value:

df.xs(5, level=-1)

Set value:

df.set_value(df.xs(5, level=-1).index, 'a', 100)

0人赞添加讨论(0) 举报

贪生不怕死

3楼-- · 2019-07-20 16:43

You can use DatFrame.query() method for querying MultiIndex DFs:

In [54]: df
Out[54]:
                         a  b
datetime            idx
2010-10-01 04:00:00 5    3  5
2010-10-01 03:00:00 6    2  4
2010-10-01 02:00:00 7    1  3
2010-10-01 01:00:00 8    0  2
2010-10-01 00:00:00 9   -1  1

In [55]: df.query('idx==5')
Out[55]:
                         a  b
datetime            idx
2010-10-01 04:00:00 5    3  5

In [56]: df.query('idx==5')['a']
Out[56]:
datetime             idx
2010-10-01 04:00:00  5      3
Name: a, dtype: int32

Or you can use DataFrame.eval() method if you need to set/update some cells:

In [61]: df.loc[df.eval('idx==5'), 'a'] = 100

In [62]: df
Out[62]:
                           a  b
datetime            idx
2010-10-01 04:00:00 5    100  5
2010-10-01 03:00:00 6      2  4
2010-10-01 02:00:00 7      1  3
2010-10-01 01:00:00 8      0  2
2010-10-01 00:00:00 9     -1  1

Explanation:

In [59]: df.eval('idx==5')
Out[59]:
datetime             idx
2010-10-01 04:00:00  5       True
2010-10-01 03:00:00  6      False
2010-10-01 02:00:00  7      False
2010-10-01 01:00:00  8      False
2010-10-01 00:00:00  9      False
dtype: bool

In [60]: df.loc[df.eval('idx==5')]
Out[60]:
                         a  b
datetime            idx
2010-10-01 04:00:00 5    3  5

PS if your original MultiIndex doesn't have names, you can easily set them using rename_axis() method:

df.rename_axis(('datetime','idx')).query(...)

Alternative (bit more expensive) solution - using sort_index() + pd.IndexSlice[]:

In [106]: df.loc[pd.IndexSlice[:,5], ['a']]
...
skipped
...
KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0)'

so we would need to sort index first:

In [107]: df.sort_index().loc[pd.IndexSlice[:,5], ['a']]
Out[107]:
                         a
datetime            idx
2010-10-01 04:00:00 5    3

0人赞添加讨论(0) 举报

虎瘦雄心在

4楼-- · 2019-07-20 16:46

In case to be used in a loop in a large data set, I realized it is about 20 times faster to extract the columns of the dataframe to pandas Series type first, then continue with the selecting and assigning operations.

Even faster (almost 10000 times faster) to a numpy array, if the index labels happen to be consecutive integers.

MYGz's solution was good, but in my use case in a for-loop, it was too slow to be feasible as these operations took most of the time.

0人赞添加讨论(0) 举报

Slicing and assigning values multi-indexed pandas

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间