Get column names for max values over a certain row

Posted 2020-03-26 06:25

In the DataFrame

import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]}, index=['row1', 'row2', 'row3'])
print(df)
       col1  col2  col3
row1     1     3     1
row2     2     2     1
row3     3     1     1

I want to get the column names of the cells with the max value(s) over a certain row.

The desired output would be (in pseudocode):

get_column_name_for_max_values_of(row2)
>['col1','col2']

What would be the most concise way to express

get_column_name_for_max_values_of(row2)

?

Tags: python pandas
2 answers
chillily
Reply #2 · 2020-03-26 06:52

You could also use apply with a helper function such as:

import numpy as np

def returncolname(row, colnames):
    # np.argmax gives the position of the first maximum only, so ties are not reported
    return colnames[np.argmax(row.values)]

df.apply(lambda x: returncolname(x, df.columns), axis=1)

Out[62]: 
row1    col2
row2    col1
row3    col1
dtype: object

And you can use df.max(axis=1) to extract the maximum of each row:

df.max(axis=1)
Out[69]: 
row1    3
row2    2
row3    3
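
The two pieces above can be combined to recover every tied column, not just the first one. The following is only a sketch and not part of the original answer: the helper name all_max_columns is hypothetical, and it assumes the frame holds only numeric columns.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

def all_max_columns(frame):
    # Hypothetical helper: map each row label to all columns holding that row's max
    row_max = frame.max(axis=1)
    # Boolean frame, True where a cell equals the maximum of its own row
    is_max = frame.eq(row_max, axis=0)
    return {
        label: frame.columns[flags.values].tolist()
        for label, flags in is_max.iterrows()
    }

result = all_max_columns(df)
print(result)
```

For the example frame, row2 maps to both col1 and col2, which is what the question asks for.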
贼婆χ
Reply #3 · 2020-03-26 06:58

If there are no duplicates, you can use idxmax, but it returns only the first column holding the max value:

print (df.idxmax(1))
row1    col2
row2    col1
row3    col1
dtype: object

def get_column_name_for_max_values_of(row):
    return df.idxmax(axis=1).loc[row]

print (get_column_name_for_max_values_of('row2'))
col1

But if the max value is duplicated, use boolean indexing:

print (df.loc['row2'] == df.loc['row2'].max())
col1     True
col2     True
col3    False
Name: row2, dtype: bool

print (df.loc[:, df.loc['row2'] == df.loc['row2'].max()])
      col1  col2
row1     1     3
row2     2     2
row3     3     1

print (df.loc[:, df.loc['row2'] == df.loc['row2'].max()].columns)
Index(['col1', 'col2'], dtype='object')

And the function is:

def get_column_name_for_max_values_of(row):
    return df.loc[:, df.loc[row] == df.loc[row].max()].columns.tolist()

print (get_column_name_for_max_values_of('row2'))
['col1', 'col2']
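
The function above reads df from the enclosing scope. As a possible variant (a sketch only; the name max_columns_of and the extra frame parameter are assumptions, not from the answer), the frame can be passed in explicitly and the row filtered directly instead of slicing the whole DataFrame:

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

def max_columns_of(frame, row):
    # Hypothetical variant: select the row as a Series indexed by column name
    values = frame.loc[row]
    # Keep only the entries equal to the row maximum and return their labels
    return values[values == values.max()].index.tolist()

print(max_columns_of(df, 'row2'))
```

This avoids relying on a global and works on a single Series, so no intermediate column-filtered DataFrame is built.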