I've got a dataframe called data. How would I rename only one column header? For example, gdp to log(gdp)?
data =
    y  gdp  cap
0   1    2    5
1   2    3    9
2   8    7    2
3   3    4    7
4   6    7    7
5   4    8    3
6   8    2    8
7   9    9   10
8   6    6    4
9  10   10    7
A much faster implementation is to use a list comprehension if you need to rename a single column.
If you need to rename multiple columns, one option is to chain conditional expressions inside the comprehension.
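As a minimal sketch of the list-comprehension approach, assuming a small frame built from the first three rows of the question's data. The target names log(y) and log(cap) are hypothetical and only illustrate the multi-column case.

```python
import pandas as pd

# First three rows of the question's data (illustrative subset).
data = pd.DataFrame({"y": [1, 2, 8], "gdp": [2, 3, 7], "cap": [5, 9, 2]})

# Single column: replace 'gdp', keep every other name as it is.
data.columns = ["log(gdp)" if col == "gdp" else col for col in data.columns]

# Several columns: chain conditional expressions.
# 'log(y)' and 'log(cap)' are hypothetical names used for illustration.
data.columns = [
    "log(y)" if col == "y" else "log(cap)" if col == "cap" else col
    for col in data.columns
]

print(list(data.columns))
```

Assigning to data.columns modifies the frame in place, so no reassignment of data is needed.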
Or construct a mapping with a dictionary and run the list comprehension with its get method, using the old name as the default value.
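A short sketch of the dictionary variant, again on an illustrative three-row frame. The mapping contents are assumptions; any name missing from the mapping falls back to itself through the default argument of get.

```python
import pandas as pd

data = pd.DataFrame({"y": [1, 2, 8], "gdp": [2, 3, 7], "cap": [5, 9, 2]})

# Old name -> new name. 'log(cap)' is a hypothetical second rename.
mapping = {"gdp": "log(gdp)", "cap": "log(cap)"}

# dict.get(col, col) returns the new name if present, else the old one.
data.columns = [mapping.get(col, col) for col in data.columns]

print(list(data.columns))
```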
Pandas 0.21+ Answer
There have been some significant updates to column renaming in version 0.21.
The rename method has added the axis parameter, which may be set to columns or 1. This update makes the method match the rest of the pandas API. It still has the index and columns parameters, but you are no longer forced to use them.
The set_axis method with inplace set to False enables you to rename all the index or column labels with a list.
Examples for Pandas 0.21+
Construct sample DataFrame:
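The original construction code is not shown here, so the following is only a plausible stand-in: a three-row frame using values from the question, with the column order cap, gdp, y chosen to match the list that appears later in this answer.

```python
import pandas as pd

# Illustrative sample; values are the first three rows of the question's data.
# The explicit columns argument fixes the order to cap, gdp, y.
df = pd.DataFrame(
    {"y": [1, 2, 8], "gdp": [2, 3, 7], "cap": [5, 9, 2]},
    columns=["cap", "gdp", "y"],
)

print(df)
```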
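Using rename with axis='columns' or axis=1 (new for 0.21)
Pass a mapping of old names to new names along with either axis='columns' or axis=1. Both spellings produce the same renamed DataFrame.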
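A minimal sketch of the two spellings, assuming a sample frame with columns cap, gdp and y (the values are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"cap": [5, 9, 2], "gdp": [2, 3, 7], "y": [1, 2, 8]})

# The mapping is applied to the axis named by the axis argument.
renamed_a = df.rename({"gdp": "log(gdp)"}, axis="columns")
renamed_b = df.rename({"gdp": "log(gdp)"}, axis=1)

print(renamed_a)
```

rename returns a new DataFrame here, so df keeps its original column names.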
It is still possible to use the old method signature:
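For comparison, a sketch of the older signature using the columns keyword, on the same kind of illustrative sample frame:

```python
import pandas as pd

df = pd.DataFrame({"cap": [5, 9, 2], "gdp": [2, 3, 7], "y": [1, 2, 8]})

# Older signature: name the axis through the columns keyword.
renamed = df.rename(columns={"gdp": "log(gdp)"})

print(list(renamed.columns))
```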
The rename method also accepts a function, which is applied to each column name.
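A sketch of passing a callable, in both the axis form and the columns form. The particular functions (str.upper and a small lambda) are illustrative choices, not taken from the original answer.

```python
import pandas as pd

df = pd.DataFrame({"cap": [5, 9, 2], "gdp": [2, 3, 7], "y": [1, 2, 8]})

# The callable receives each column label and returns the new label.
upper = df.rename(str.upper, axis="columns")

# Same idea through the columns keyword, touching only 'gdp'.
logged = df.rename(
    columns=lambda name: f"log({name})" if name == "gdp" else name
)

print(list(upper.columns), list(logged.columns))
```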
Using set_axis with a list and inplace=False
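You can supply a list to the set_axis method, provided its length equals the number of columns (or index labels). In 0.21, inplace defaults to True, but it is slated to default to False in future releases. (Later releases did change the default, and pandas 2.0 removed the inplace parameter from set_axis.)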
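A sketch of set_axis with a full list of labels. It assumes pandas 1.0 or later, where the call returns a new DataFrame by default; on 0.21 through 0.25 you would add inplace=False, which is omitted here because recent versions no longer accept it.

```python
import pandas as pd

df = pd.DataFrame({"cap": [5, 9, 2], "gdp": [2, 3, 7], "y": [1, 2, 8]})

# The list must have exactly one label per column.
new_labels = ["cap", "log(gdp)", "y"]

# Assumes pandas >= 1.0; on 0.21-0.25 pass inplace=False as well.
by_name = df.set_axis(new_labels, axis="columns")
by_number = df.set_axis(new_labels, axis=1)

print(list(by_name.columns))
```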
Why not use df.columns = ['cap', 'log(gdp)', 'y']?
There is nothing wrong with assigning columns directly like this. It is a perfectly good solution.
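To make the comparison concrete, here is a rough sketch of direct assignment next to set_axis used inside a method chain. The filter on y and the reset_index step are hypothetical chain steps added for illustration, and the set_axis call assumes pandas 1.0 or later.

```python
import pandas as pd

df = pd.DataFrame({"cap": [5, 9, 2], "gdp": [2, 3, 7], "y": [1, 2, 8]})

# Direct assignment changes the frame it is applied to,
# so work on a copy to leave df untouched.
direct = df.copy()
direct.columns = ["cap", "log(gdp)", "y"]

# set_axis fits in the middle of a chain without an intermediate variable.
chained = (
    df[df["y"] > 1]  # hypothetical filtering step
    .set_axis(["cap", "log(gdp)", "y"], axis="columns")
    .reset_index(drop=True)
)

print(chained)
```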
The advantage of using set_axis is that it can be used as part of a method chain and that it returns a new copy of the DataFrame. Without it, you would have to store the intermediate steps of the chain in another variable before reassigning the columns.
The rename documentation shows that it accepts a dict for the columns parameter, so you just pass a dict with a single entry.
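Applied to the question's data, that might look like the sketch below. The frame is rebuilt from the table in the question; reassigning the result back to data is one option, and rename also accepts inplace=True.

```python
import pandas as pd

# The question's dataframe, reconstructed from its printed table.
data = pd.DataFrame({
    "y":   [1, 2, 8, 3, 6, 4, 8, 9, 6, 10],
    "gdp": [2, 3, 7, 4, 7, 8, 2, 9, 6, 10],
    "cap": [5, 9, 2, 7, 7, 3, 8, 10, 4, 7],
})

# A dict with a single entry renames only that column.
data = data.rename(columns={"gdp": "log(gdp)"})

print(data.head())
```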