Rename a single column header in a pandas DataFrame

Posted 2019-01-12 16:37

I've got a dataframe called data. How would I rename just one column header? For example, gdp to log(gdp)?

data =
    y  gdp  cap
0   1    2    5
1   2    3    9
2   8    7    2
3   3    4    7
4   6    7    7
5   4    8    3
6   8    2    8
7   9    9   10
8   6    6    4
9  10   10    7

3 Answers
我欲成王,谁敢阻挡
#2 · 2019-01-12 16:49

If you only need to rename a single column, a list comprehension is a much faster option (see the timings below).

df.columns = ['log(gdp)' if x=='gdp' else x for x in df.columns]

If the need arises to rename multiple columns, either use conditional expressions like:

df.columns = ['log(gdp)' if x=='gdp' else 'cap_mod' if x=='cap' else x for x in df.columns]

Or construct a mapping with a dictionary and run the list comprehension over its get method, passing the old name as the default value:

col_dict = {'gdp': 'log(gdp)', 'cap': 'cap_mod'}   ## key→old name, value→new name

df.columns = [col_dict.get(x, x) for x in df.columns]
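As a minimal, self-contained sketch of the dictionary approach: the three-row frame below is only a stand-in for the question's data, and the column order is pinned explicitly so the result does not depend on the pandas version.

```python
import pandas as pd

# Small stand-in for the question's data; column order fixed explicitly.
df = pd.DataFrame({'y': [1, 2, 8], 'gdp': [2, 3, 7], 'cap': [5, 9, 2]},
                  columns=['y', 'gdp', 'cap'])

col_dict = {'gdp': 'log(gdp)', 'cap': 'cap_mod'}  # old name -> new name

# Names missing from the mapping fall back to themselves via get()'s default.
df.columns = [col_dict.get(x, x) for x in df.columns]

print(list(df.columns))
```

Note that assigning to df.columns modifies df in place, so keep a copy first if the original labels are still needed.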

Timings:

%%timeit
df.rename(columns={'gdp':'log(gdp)'}, inplace=True)
10000 loops, best of 3: 168 µs per loop

%%timeit
df.columns = ['log(gdp)' if x=='gdp' else x for x in df.columns]
10000 loops, best of 3: 58.5 µs per loop
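These numbers depend on the machine and the pandas version, so it is worth re-measuring. Here is a rough sketch for doing that outside IPython with the standard timeit module. The helper names with_rename and with_comprehension are made up for illustration, and each call works on a shallow copy so that the column being renamed still exists on every iteration.

```python
import timeit

import pandas as pd

base = pd.DataFrame({'y': [1, 2, 8], 'gdp': [2, 3, 7], 'cap': [5, 9, 2]},
                    columns=['y', 'gdp', 'cap'])


def with_rename():
    # Shallow copy: the labels can be replaced without touching base.
    out = base.copy(deep=False)
    out.rename(columns={'gdp': 'log(gdp)'}, inplace=True)
    return out


def with_comprehension():
    out = base.copy(deep=False)
    out.columns = ['log(gdp)' if x == 'gdp' else x for x in out.columns]
    return out


# Report the average time per call for each approach.
for func in (with_rename, with_comprehension):
    seconds = timeit.timeit(func, number=1000)
    print(f'{func.__name__}: {seconds / 1000 * 1e6:.1f} µs per call')
```

Both helpers pay the same copy overhead, so only the difference between the two figures is meaningful.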
三岁会撩人
#3 · 2019-01-12 16:59

Pandas 0.21+ Answer

There have been some significant updates to column renaming in version 0.21.

  • The rename method gained an axis parameter, which may be set to columns or 1. This makes the method consistent with the rest of the pandas API. It still has the index and columns parameters, but you are no longer forced to use them.
  • The set_axis method with inplace set to False lets you rename all the index or column labels with a list.

Examples for Pandas 0.21+

Construct sample DataFrame:

import pandas as pd

df = pd.DataFrame({'y':[1,2,8], 'gdp':[2,3,7], 'cap':[5,9,2]},
                  columns=['cap', 'gdp', 'y'])

   cap  gdp  y
0    5    2  1
1    9    3  2
2    2    7  8

Using rename with axis='columns' or axis=1 (new for 0.21)

df.rename({'gdp':'log(gdp)'}, axis='columns')

or

df.rename({'gdp':'log(gdp)'}, axis=1)

Both result in the following:

   cap  log(gdp)  y
0    5         2  1
1    9         3  2
2    2         7  8

It is still possible to use the old method signature:

df.rename(columns={'gdp':'log(gdp)'})

The rename method also accepts a function, which is applied to each column name.

df.rename(lambda x: 'log(gdp)' if x == 'gdp' else x, axis='columns')

or

df.rename(lambda x: 'log(gdp)' if x == 'gdp' else x, axis=1)
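A short sketch of the function form, using the same small sample frame as an assumption. Any callable that maps an old label to a new one should work, so the built-in str.upper is included as a second illustration.

```python
import pandas as pd

df = pd.DataFrame({'cap': [5, 9, 2], 'gdp': [2, 3, 7], 'y': [1, 2, 8]},
                  columns=['cap', 'gdp', 'y'])

# The function is called once per column label.
renamed = df.rename(lambda x: 'log(gdp)' if x == 'gdp' else x, axis='columns')

# Any label -> label callable can be used, for example str.upper.
upper = df.rename(columns=str.upper)

print(list(renamed.columns))
print(list(upper.columns))
print(list(df.columns))  # rename returned new frames; df keeps its labels
```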

Using set_axis with a list and inplace=False

You can supply a list to the set_axis method that is equal in length to the number of columns (or index). In pandas 0.21, inplace defaults to True. The default changed to False in pandas 1.0, and the inplace parameter was removed entirely in pandas 2.0, so on current versions simply omit it.

df.set_axis(['cap', 'log(gdp)', 'y'], axis='columns', inplace=False)

or

df.set_axis(['cap', 'log(gdp)', 'y'], axis=1, inplace=False)
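On a recent pandas the same call can be written without inplace. The sketch below assumes a version where set_axis returns a new object by default (1.0 or later, as far as I know), and also shows that a list of the wrong length is rejected.

```python
import pandas as pd

df = pd.DataFrame({'cap': [5, 9, 2], 'gdp': [2, 3, 7], 'y': [1, 2, 8]},
                  columns=['cap', 'gdp', 'y'])

# Returns a new DataFrame with the full list of labels replaced.
renamed = df.set_axis(['cap', 'log(gdp)', 'y'], axis='columns')
print(list(renamed.columns))

# The list must contain exactly one label per column.
try:
    df.set_axis(['cap', 'log(gdp)'], axis='columns')
    length_checked = False
except ValueError as exc:
    length_checked = True
    print('rejected:', exc)
```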

Why not use df.columns = ['cap', 'log(gdp)', 'y']?

There is nothing wrong with assigning columns directly like this. It is a perfectly good solution.

The advantage of using set_axis is that it can be used as part of a method chain and that it returns a new copy of the DataFrame. Without it, you would have to store your intermediate steps of the chain to another variable before reassigning the columns.

# new for pandas 0.21+ (some_method1, some_method2, some_method3 are placeholders)
(df.some_method1()
   .some_method2()
   .set_axis(columns, axis='columns')
   .some_method3())

# old way
df1 = (df.some_method1()
         .some_method2())
df1.columns = columns
df1.some_method3()
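To make the chain concrete, here is a hypothetical pipeline with real methods standing in for the placeholders: a row filter, the relabel, then a sort. It assumes a pandas version where set_axis returns a new frame by default, and the particular steps are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'cap': [5, 9, 2], 'gdp': [2, 3, 7], 'y': [1, 2, 8]},
                  columns=['cap', 'gdp', 'y'])

result = (
    df.loc[lambda d: d['y'] > 1]                            # keep rows with y > 1
      .set_axis(['cap', 'log(gdp)', 'y'], axis='columns')   # relabel mid-chain
      .sort_values('log(gdp)')                              # later steps see the new name
      .reset_index(drop=True)
)

print(result)
```

Because the relabel happens inside the chain, the steps after it can already refer to the new column name, and no intermediate variable is needed.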
兄弟一词,经得起流年.
#4 · 2019-01-12 17:15
data.rename(columns={'gdp':'log(gdp)'}, inplace=True)

The rename documentation shows that it accepts a dict for the columns parameter, so you just pass a dict with a single entry.
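One caveat worth a quick sketch: by default rename silently ignores keys that match no column, so a typo goes unnoticed. The example below assumes pandas 0.25 or later, where I believe the errors parameter was introduced, and uses the non-inplace form on a small stand-in for data.

```python
import pandas as pd

data = pd.DataFrame({'y': [1, 2, 8], 'gdp': [2, 3, 7], 'cap': [5, 9, 2]},
                    columns=['y', 'gdp', 'cap'])

# Without inplace=True, rename returns a new frame and leaves data alone.
renamed = data.rename(columns={'gdp': 'log(gdp)'})
print(list(renamed.columns))

# A misspelled key ('gpd') is ignored by default; errors='raise' surfaces it.
try:
    data.rename(columns={'gpd': 'log(gdp)'}, errors='raise')
    typo_caught = False
except KeyError as exc:
    typo_caught = True
    print('unknown column:', exc)
```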

Also see related
