How to change the order of DataFrame columns?

Posted 2018-12-31 19:39

I have the following DataFrame (df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

I add more column(s) by assignment:

df['mean'] = df.mean(1)

How can I move the column mean to the front, i.e. set it as first column leaving the order of the other columns untouched?

Tags: python pandas
25 answers
深知你不懂我心
Reply #2 · 2018-12-31 20:05

Simply do:

df = df[['mean'] + df.columns[:-1].tolist()]
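A minimal sketch of this one-liner applied to the frame from the question. It assumes 'mean' is the last column, as it is right after the assignment; the slice df.columns[:-1] simply drops the final label, so the trick would not work if the column to move sat anywhere else.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# 'mean' was appended last, so columns[:-1] is every other column, in order.
df = df[['mean'] + df.columns[:-1].tolist()]

print(df.columns.tolist())
```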
琉璃瓶的回忆
Reply #3 · 2018-12-31 20:05

This function saves you from listing every variable in your dataset just to reorder a few of them.

def order(frame, var):
    if isinstance(var, str):
        var = [var]  # let the function take a string or a list
    varlist = [w for w in frame.columns if w not in var]
    return frame[var + varlist]

It takes two arguments: the first is the dataset, and the second is the column (or list of columns) that you want to bring to the front.

So in my case I have a data set called frame with variables A1, A2, B1, B2, Total and Date. If I want to bring Total to the front then all I have to do is:

frame = order(frame,['Total'])

If I want to bring Total and Date to the front then I do:

frame = order(frame,['Total','Date'])

EDIT:

Another useful way to use this: if you have an unfamiliar table and you're looking for variables with a particular term in them, like VAR1, VAR2, ..., you can run something like:

frame = order(frame,[v for v in frame.columns if "VAR" in v])
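To make the behaviour concrete, here is a small run of the function on a made-up frame. The column names and values are illustrative assumptions, and the function is repeated so the block runs on its own.

```python
import pandas as pd

def order(frame, var):
    # Accept a single column name or a list of names.
    if isinstance(var, str):
        var = [var]
    rest = [c for c in frame.columns if c not in var]
    return frame[var + rest]

# Hypothetical data, only for illustration.
frame = pd.DataFrame({
    'A1': [1, 2], 'A2': [3, 4], 'VAR1': [5, 6],
    'Total': [9, 12], 'VAR2': [7, 8],
})

total_first = order(frame, 'Total')
var_first = order(frame, [v for v in frame.columns if 'VAR' in v])

print(total_first.columns.tolist())
print(var_first.columns.tolist())
```

The columns that are not named keep their original relative order in both calls.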
听够珍惜
Reply #4 · 2018-12-31 20:06

The simplest way would be to list the column names in the order you want (Col1, Col2, Col3 stand for your own column labels):

df = df[['mean', 'Col1', 'Col2', 'Col3']]
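As a quick illustration, the sketch below applies the explicit-list approach to a tiny frame. The names Col1 to Col3 and the values are placeholders, not taken from the question.

```python
import pandas as pd

# Hypothetical frame; the column names are placeholders.
df = pd.DataFrame({'Col1': [1.0, 2.0], 'Col2': [3.0, 4.0], 'Col3': [5.0, 6.0]})
df['mean'] = df.mean(axis=1)

# Selecting with a list of labels returns the columns in that order.
df = df[['mean', 'Col1', 'Col2', 'Col3']]

print(df.columns.tolist())
```

This is easy to read for a handful of columns, but every label has to be typed out, so it scales poorly to wide frames.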

梦寄多情
Reply #5 · 2018-12-31 20:07

I ran into a similar question myself, and just wanted to add what I settled on. I liked the reindex_axis() method for changing column order. This worked:

df = df.reindex_axis(['mean'] + list(df.columns[:-1]), axis=1)

An alternate method based on the comment from @Jorge:

df = df.reindex(columns=['mean'] + list(df.columns[:-1]))

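Although reindex_axis seemed slightly faster than reindex in micro-benchmarks, I prefer the latter for its directness. Note that reindex_axis was deprecated in pandas 0.21 and removed in pandas 1.0, so only the reindex form works on current versions.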
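A short sketch of the reindex variant on the question's frame, again assuming 'mean' is the last column. One design point worth knowing: reindex selects by label, so a label that does not exist in the frame produces a column of NaN rather than an error.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# Put 'mean' first, followed by every column except the last one.
df = df.reindex(columns=['mean'] + list(df.columns[:-1]))

print(df.columns.tolist())
```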

只若初见
Reply #6 · 2018-12-31 20:08

Here is a function to do this for any number of columns.

def mean_first(df):
    ncols = df.shape[1]        # Get the number of columns
    index = list(range(ncols)) # Create an index to reorder the columns
    index.insert(0,ncols)      # This puts the last column at the front
    return df.assign(mean=df.mean(axis=1)).iloc[:, index]  # new df with last column (mean) first
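A usage sketch for this function, with the definition repeated (lightly condensed) so the block is self-contained. Because assign returns a new frame, the original df is left without a 'mean' column.

```python
import numpy as np
import pandas as pd

def mean_first(df):
    ncols = df.shape[1]
    # Position ncols is the 'mean' column that assign appends at the end.
    index = [ncols] + list(range(ncols))
    return df.assign(mean=df.mean(axis=1)).iloc[:, index]

df = pd.DataFrame(np.random.rand(10, 5))
result = mean_first(df)

print(result.columns.tolist())
print(df.columns.tolist())
```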
无与为乐者.
Reply #7 · 2018-12-31 20:10

In your case,

df = df.reindex_axis(['mean',0,1,2,3,4], axis=1)

will do exactly what you want.

In my case (general form):

df = df.reindex_axis(sorted(df.columns), axis=1)
df = df.reindex_axis(['opened'] + [a for a in df.columns if a != 'opened'], axis=1)

Update, Jan 2018

If you want to use reindex (required on pandas 1.0 and later, where reindex_axis no longer exists):

df = df.reindex(columns=sorted(df.columns))
df = df.reindex(columns=['opened'] + [a for a in df.columns if a != 'opened'])
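A small sketch of the general form on a hypothetical frame (the column names other than 'opened' are made up for illustration). Note that sorted() needs labels that can be compared with each other, so the sorting step would raise a TypeError on the question's frame, which mixes integer labels with the string 'mean'.

```python
import pandas as pd

# Hypothetical columns, for illustration only.
df = pd.DataFrame({'title': ['a'], 'opened': [1], 'closed': [0], 'author': ['x']})

# Step 1: sort the columns alphabetically.
df = df.reindex(columns=sorted(df.columns))
# Step 2: move 'opened' to the front, keeping the rest in sorted order.
df = df.reindex(columns=['opened'] + [c for c in df.columns if c != 'opened'])

print(df.columns.tolist())
```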