Seaborn.countplot : order categories by count, als

Posted 2019-07-29 03:55

Question:

I understand how to sort the bars of a bar chart (i.e. here). What I cannot find is how to sort the bars by the count of one of the subcategories.

For example, given the following dataframe, I can get the bar plots. What I would like is to have the names sorted from greatest to least by the count of Type 'Classic'.

import pandas as pd

test_df = pd.DataFrame([
['Jake',    38, 'MW',   'Classic'],
['John',    38,'NW',    'Classic'],
['Sam', 34, 'SE',   'Classic'],
['Sam', 22, 'E' ,'Classic'],
['Joe', 43, 'ESE2', 'Classic'],
['Joe', 34, 'MTN2', 'Classic'],
['Joe', 38, 'MTN2', 'Classic'],
['Scott',   38, 'ESE2', 'Classic'],
['Chris',   34, 'SSE1', 'Classic'],
['Joe', 43, 'S1',   'New'],
['Paul',    34, 'NE2',  'New'],
['Joe', 38, 'MC1',  'New'],
['Joe', 34, 'NE2',  'New'],
['Nick',    38, 'MC1',  'New'],
['Al',  38, 'SSE1', 'New'],
['Al',  34, 'ME',   'New'],
['Al',  34, 'MC1',  'New'],
['Joe', 43, 'S1',   'New']], columns = ['Name','Code_A','Code_B','Type'])


import seaborn as sns
sns.set(style="darkgrid")
palette ={"Classic":"#FF9999","New":"#99CC99"}


g = sns.countplot(y="Name",
                  palette=palette,
                  hue="Type",
                  data=test_df)

So instead of the default order, 'Joe' would be on top, followed by 'Sam', etc.

Answer 1:

Pass the order argument to countplot. Use pandas.crosstab to count each Name per Type, then sort_values on the 'Classic' column to get the names in the desired order:

import pandas as pd

test_df = pd.DataFrame([
['Jake',    38, 'MW',   'Classic'],
['John',    38,'NW',    'Classic'],
['Sam', 34, 'SE',   'Classic'],
['Sam', 22, 'E' ,'Classic'],
['Joe', 43, 'ESE2', 'Classic'],
['Joe', 34, 'MTN2', 'Classic'],
['Joe', 38, 'MTN2', 'Classic'],
['Scott',   38, 'ESE2', 'Classic'],
['Chris',   34, 'SSE1', 'Classic'],
['Joe', 43, 'S1',   'New'],
['Paul',    34, 'NE2',  'New'],
['Joe', 38, 'MC1',  'New'],
['Joe', 34, 'NE2',  'New'],
['Nick',    38, 'MC1',  'New'],
['Al',  38, 'SSE1', 'New'],
['Doug',    34, 'ME',   'New'],
['Fred',    34, 'MC1',  'New'],
['Joe', 43, 'S1',   'New']], columns = ['Name','Code_A','Code_B','Type'])


import seaborn as sns
sns.set(style="darkgrid")
palette ={"Classic":"#FF9999","New":"#99CC99"}

# Count each Name per Type, then list the names by 'Classic' count, largest first
order = pd.crosstab(test_df.Name, test_df.Type).sort_values('Classic', ascending=False).index
g = sns.countplot(y="Name",
                  palette=palette,
                  hue="Type",
                  data=test_df,
                  order=order
                 )
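
To see what the order expression produces, here is a minimal sketch on a reduced stand-in for test_df. The frame `df` and its rows are made up for illustration, and the tie-break on the 'New' column is an assumption that is not part of the answer above; it only makes the order deterministic when several names share the same 'Classic' count.

```python
import pandas as pd

# Hypothetical, reduced stand-in for test_df: only the two columns that matter.
df = pd.DataFrame({
    "Name": ["Jake", "Sam", "Sam", "Joe", "Joe", "Joe", "Joe", "Al"],
    "Type": ["Classic", "Classic", "Classic", "Classic",
             "Classic", "Classic", "New", "New"],
})

# One row per Name, one column per Type; each cell holds a count.
counts = pd.crosstab(df["Name"], df["Type"])
print(counts)

# Sort by the 'Classic' count, largest first. The second key ('New') is an
# assumed tie-breaker, not something the original answer uses.
order = counts.sort_values(["Classic", "New"], ascending=False).index.tolist()
print(order)  # ['Joe', 'Sam', 'Jake', 'Al']
```

The resulting list can be passed as order=order to sns.countplot exactly as in the answer. Because crosstab keeps names whose 'Classic' count is 0 ('Al' in this sketch), those names still appear in the order and are simply placed last; as far as I know, levels left out of order are not drawn at all, which is why counting only the 'Classic' rows would likely drop them from the plot.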