After rename column get keyerror

I have df:

df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})

print (df)
   a  b  c
0  7  1  5
1  8  3  3
2  9  5  6

Then rename first value by this:

df.columns.values[0] = 'f'

All seems very nice:

print (df)
   f  b  c
0  7  1  5
1  8  3  3
2  9  5  6

print (df.columns)
Index(['f', 'b', 'c'], dtype='object')

print (df.columns.values)
['f' 'b' 'c']

If select b it works nice:

print (df['b'])
0    1
1    3
2    5
Name: b, dtype: int64

But if select a it return column f:

print (df['a'])
0    7
1    8
2    9
Name: f, dtype: int64

And if select f get keyerror.

print (df['f'])
#KeyError: 'f'

print (df.info())
#KeyError: 'f'

What is problem? Can somebody explain it? Or bug?

标签： pandas numpy multiple-columns rename

1条回答

别忘想泡老子

2楼-- · 2019-01-09 15:51

You aren't expected to alter the values attribute.

Try df.columns.values = ['a', 'b', 'c'] and you get:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-61-e7e440adc404> in <module>()
----> 1 df.columns.values = ['a', 'b', 'c']

AttributeError: can't set attribute

That's because pandas detects that you are trying to set the attribute and stops you.

However, it can't stop you from changing the underlying values object itself.

When you use rename, pandas follows up with a bunch of clean up stuff. I've pasted the source below.

Ultimately what you've done is altered the values without initiating the clean up. You can initiate it yourself with a followup call to _data.rename_axis (example can be seen in source below). This will force the clean up to be run and then you can access ['f']

df._data = df._data.rename_axis(lambda x: x, 0, True)
df['f']

0    7
1    8
2    9
Name: f, dtype: int64

Moral of the story: probably not a great idea to rename a column this way.

but this story gets weirder

This is fine

df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})

df.columns.values[0] = 'f'

df['f']

0    7
1    8
2    9
Name: f, dtype: int64

This is not fine

df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})

print(df)

df.columns.values[0] = 'f'

df['f']

KeyError:

Turns out, we can modify the values attribute prior to displaying df and it will apparently run all the initialization upon the first display. If you display it prior to changing the values attribute, it will error out.

weirder still

df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})

print(df)

df.columns.values[0] = 'f'

df['f'] = 1

df['f']

   f  f
0  7  1
1  8  1
2  9  1

As if we didn't already know that this was a bad idea...

source for rename

def rename(self, *args, **kwargs):

    axes, kwargs = self._construct_axes_from_arguments(args, kwargs)
    copy = kwargs.pop('copy', True)
    inplace = kwargs.pop('inplace', False)

    if kwargs:
        raise TypeError('rename() got an unexpected keyword '
                        'argument "{0}"'.format(list(kwargs.keys())[0]))

    if com._count_not_none(*axes.values()) == 0:
        raise TypeError('must pass an index to rename')

    # renamer function if passed a dict
    def _get_rename_function(mapper):
        if isinstance(mapper, (dict, ABCSeries)):

            def f(x):
                if x in mapper:
                    return mapper[x]
                else:
                    return x
        else:
            f = mapper

        return f

    self._consolidate_inplace()
    result = self if inplace else self.copy(deep=copy)

    # start in the axis order to eliminate too many copies
    for axis in lrange(self._AXIS_LEN):
        v = axes.get(self._AXIS_NAMES[axis])
        if v is None:
            continue
        f = _get_rename_function(v)

        baxis = self._get_block_manager_axis(axis)
        result._data = result._data.rename_axis(f, axis=baxis, copy=copy)
        result._clear_item_cache()

    if inplace:
        self._update_inplace(result._data)
    else:
        return result.__finalize__(self)

0人赞添加讨论(0) 举报

After rename column get keyerror

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间