name of column, that contains the max value

2019-03-01 06:44发布

问题:

I have dataframe that looks like:

Alice          Eleonora    Mike     Helen
   2               7          8       6                 
   11              5          9       4
   6              15         12       3
   5               3          7       8

I want ot create the new column that containes for each row the name of the column with max value for given row

Alice          Eleonora    Mike     Helen    _Max
   2               7          8       6        Mike         
   11              5          9       4        Alice
   6              15         12       3        Eleonora
   5               3          7       8        Helen

I figure out how to get the max value:

df['_Max']=df[['Alice', 'Eleonora', 'Mike', 'Helen']].max(axis=1)

but how to get the name of column with max value and write it into _Max instead of value itself?

回答1:

You can use apply with a lambda to return the name of the column, here we compare the value row-wise against the max, this produces a boolean mask we can use to mask the columns:

In [229]:
df['MAX'] = df.apply( lambda x: df.columns[x == x.max()][0], axis=1)
df

Out[229]:
   Alice  Eleonora  Mike  Helen       MAX
0      2         7     8      6      Mike
1     11         5     9      4     Alice
2      6        15    12      3  Eleonora
3      5         3     7      8     Helen

Here is the boolean mask:

In [232]:
df.apply( lambda x: x == x.max(), axis=1)

Out[232]:
   Alice Eleonora   Mike  Helen
0  False    False   True  False
1   True    False  False  False
2  False     True  False  False
3  False    False  False   True