I have a DataFrame df:
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9],
                   'b': [1, 3, 5],
                   'c': [5, 3, 6]})
print(df)
   a  b  c
0  7  1  5
1  8  3  3
2  9  5  6
Then I rename the first column like this:
df.columns.values[0] = 'f'
Everything seems fine:
print (df)
   f  b  c
0  7  1  5
1  8  3  3
2  9  5  6
print (df.columns)
Index(['f', 'b', 'c'], dtype='object')
print (df.columns.values)
['f' 'b' 'c']
If I select b, it works fine:
print (df['b'])
0    1
1    3
2    5
Name: b, dtype: int64
But if I select a, it returns column f:
print (df['a'])
0    7
1    8
2    9
Name: f, dtype: int64
And if I select f, I get a KeyError:
print (df['f'])
#KeyError: 'f'
print (df.info())
#KeyError: 'f'
What is the problem? Can somebody explain it, or is this a bug?
You aren't expected to alter the values attribute. Try
df.columns.values = ['a', 'b', 'c']
and you get an AttributeError. That's because pandas detects that you are trying to set the attribute and stops you. However, it can't stop you from changing the underlying values object itself.

When you use rename, pandas follows up with a bunch of cleanup. I've pasted the source below. Ultimately, what you've done is alter the values without initiating that cleanup. You can initiate it yourself with a follow-up call to _data.rename_axis (an example can be seen in the source below). This forces the cleanup to run, and then you can access ['f'].
Moral of the story: probably not a great idea to rename a column this way.
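For comparison, here is a minimal sketch of the supported ways to rename the column, using the same df as in the question. Both DataFrame.rename and assigning a whole new list to df.columns go through pandas' own machinery instead of mutating the array behind columns.values, so lookups by the new name should work afterwards. The variable names renamed and df2 are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9],
                   'b': [1, 3, 5],
                   'c': [5, 3, 6]})

# Option 1: rename returns a new DataFrame with 'a' relabelled as 'f'.
renamed = df.rename(columns={'a': 'f'})

# Option 2: replace the whole column index on a copy of the frame.
df2 = df.copy()
df2.columns = ['f', 'b', 'c']

print(renamed['f'].tolist())
print(df2['f'].tolist())
print('a' in renamed.columns)
```

Either option leaves the original df untouched here, which makes it easy to compare the old and new labels side by side.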
but this story gets weirder
This is fine
This is not fine
Turns out, we can modify the values attribute prior to displaying df, and it will apparently run all the initialization upon the first display. If you display it prior to changing the values attribute, it will error out.
weirder still
As if we didn't already know that this was a bad idea...
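One way to picture the order dependence described above is a label index that builds its lookup table lazily and then caches it. The class below is a toy model in plain Python, not pandas' actual implementation; ToyIndex, values and get_loc are hypothetical stand-ins. It assumes the lookup table is built on first use and never refreshed, which is consistent with the behavior seen in the question but remains an assumption about the internals.

```python
class ToyIndex:
    """Toy label index with a lazily built, cached lookup table."""

    def __init__(self, labels):
        self.values = list(labels)   # mutable storage, like the array behind .values
        self._lookup = None          # label -> position, built on first use

    def get_loc(self, label):
        if self._lookup is None:
            # First lookup: build the table from whatever values holds right now.
            self._lookup = {lab: pos for pos, lab in enumerate(self.values)}
        return self._lookup[label]   # raises KeyError for unknown labels


# Mutate before the first lookup: the table is built from the new labels.
early = ToyIndex(['a', 'b', 'c'])
early.values[0] = 'f'
print(early.get_loc('f'))        # 0

# Mutate after the first lookup: the cached table still knows only 'a'.
late = ToyIndex(['a', 'b', 'c'])
late.get_loc('b')                # triggers the table build
late.values[0] = 'f'
print(late.get_loc('a'))         # 0, although values now shows 'f'
try:
    late.get_loc('f')
except KeyError as exc:
    print('KeyError:', exc)
```

In this sketch the displayed labels and the lookup table disagree as soon as values is edited after the table exists, which mirrors df['a'] returning a column named f while df['f'] raises.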
source for rename