How to change the order of DataFrame columns?

Posted 2018-12-31 19:39

I have the following DataFrame (df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

I add more column(s) by assignment:

df['mean'] = df.mean(axis=1)

How can I move the column mean to the front, i.e. set it as the first column, leaving the order of the other columns untouched?

Tags: python pandas
25 answers
明月照影归
#2 · 2018-12-31 19:52

How about using "T"?

df.T.reindex(['mean',0,1,2,3,4]).T
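
As a hedged alternative to the double transpose, the same reordering can be sketched with reindex on the column axis directly. Transposing a frame with mixed dtypes can upcast its columns to object, which this avoids. The names new_order and reordered are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# Build the new column order: 'mean' first, then the rest in their original order.
new_order = ['mean'] + [c for c in df.columns if c != 'mean']

# Reindex along the column axis directly; no transpose is needed.
reordered = df.reindex(columns=new_order)
print(list(reordered.columns))
```

reindex returns a new DataFrame and leaves df itself unchanged.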
明月照影归
#3 · 2018-12-31 19:52

set():

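A simple approach is to use set(), particularly when you have a long list of columns and do not want to handle them manually. Note that a set does not preserve order, so the remaining columns may come out in a different order: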

cols = list(set(df.columns.tolist()) - set(['mean']))
cols.insert(0, 'mean')
df = df[cols]
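
If the order of the remaining columns matters, as it does in the question, one possible sketch replaces the set difference with Index.drop, which keeps the surviving labels in their original order. The variable name rest is an illustrative choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# Index.drop returns the remaining labels in their original order,
# unlike a set difference, whose iteration order is not guaranteed.
rest = df.columns.drop('mean')

# Put 'mean' first and select the columns in the new order.
cols = ['mean'] + list(rest)
df = df[cols]
```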
明月照影归
#5 · 2018-12-31 19:54

I believe @Aman's answer is the best if you know the position of the mean column.

If you don't know the position of mean, but only have its name, you cannot resort directly to cols = cols[-1:] + cols[:-1]. The following is the next-best thing I could come up with:

meanDf = pd.DataFrame(df.pop('mean'))
# now df doesn't contain "mean" anymore. Order of join will move it to left or right:
meanDf.join(df) # has mean as first column
df.join(meanDf) # has mean as last column
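
A shorter variant of the same pop-based idea, sketched here under the assumption that modifying df in place is acceptable, is to combine pop with DataFrame.insert instead of building a second frame and joining. The name mean_col is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# pop removes 'mean' from df and returns it as a Series.
mean_col = df.pop('mean')

# insert puts the Series back at position 0; both calls modify df in place.
df.insert(0, 'mean', mean_col)
```

Only the column name is needed here, not its position.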
深知你不懂我心
#6 · 2018-12-31 19:54

I liked Shoresh's answer, which uses set functionality to remove columns when you don't know their location. However, it didn't work for my purpose, as I need to keep the original column order (which has arbitrary column labels).

I got this to work though by using IndexedSet from the boltons package.

I also needed to re-add multiple column labels, so for a more general case I used the following code:

from boltons.setutils import IndexedSet
cols = list(IndexedSet(df.columns.tolist()) - set(['mean', 'std']))
cols[0:0] = ['mean', 'std']
df = df[cols]

Hope this is useful to anyone searching this thread for a general solution.
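
For readers who would rather not add a dependency, a minimal standard-library sketch of the same idea uses dict.fromkeys as an order-preserving set. It assumes Python 3.7+, where dicts keep insertion order. The std column and the name front are added here purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)
df['std'] = df[[0, 1, 2, 3, 4]].std(axis=1)  # illustrative second column

# Labels to move to the front, in the desired order.
front = ['mean', 'std']

# dict.fromkeys keeps only the first occurrence of each key, in insertion
# order, so the later duplicates of 'mean' and 'std' are dropped.
cols = list(dict.fromkeys(front + df.columns.tolist()))
df = df[cols]
```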

流年柔荑漫光年
#7 · 2018-12-31 19:55

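This question has been answered before, but reindex_axis is deprecated now, so I would suggest using reindex with axis=1 instead. Note that this sorts all columns by label rather than moving mean to the front, and a plain sorted() raises a TypeError when the labels mix integers and strings, as in the question's df, hence key=str:

df.reindex(sorted(df.columns, key=str), axis=1)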