Using Pandas crosstab with seaborn stacked barplot

Posted 2020-02-09 05:10

I am trying to create a stacked barplot in seaborn with my dataframe.

I have first generated a crosstab table in pandas like so:

pd.crosstab(df['Period'], df['Mark'])

which returns:

Mark      False  True
Period
BASELINE    583   132
WEEK 12     721     0
WEEK 24     589   132
WEEK 4      721     0

I would like to use seaborn to create the stacked barplot for consistency, since that is what I have used for the rest of my graphs. However, I have struggled to do this because I am unable to index the crosstab.

I have been able to make the plot I want in pandas using .plot.barh(stacked=True), but have had no luck with seaborn. Any ideas how I can do this?

Thanks

2 answers
我想做一个坏孩纸
#2 · 2020-02-09 05:49

The creator of Seaborn isn't a fan of stacked bar charts (though the linked discussion has a hack that combines Seaborn and Matplotlib to make them anyway).

If you're willing to accept a grouped bar chart instead of a stacked one, here's one approach:

 # first some sample data
 import numpy as np 
 import pandas as pd
 import seaborn as sns

 N = 1000
 mark = np.random.choice([True,False], N)
 periods = np.random.choice(['BASELINE','WEEK 12', 'WEEK 24', 'WEEK 4'], N)

 df = pd.DataFrame({'mark':mark,'period':periods})
 ct = pd.crosstab(df.period, df.mark)

 # example output of ct (the data is random, so counts vary from run to run):
 # mark      False  True
 # period
 # BASELINE    118    111
 # WEEK 12     117    149
 # WEEK 24     117    130
 # WEEK 4      127    131

 # now stack and reset
 stacked = ct.stack().reset_index().rename(columns={0:'value'})

 # plot grouped bar chart
 sns.barplot(x=stacked.period, y=stacked.value, hue=stacked.mark)

[Image: grouped bar chart]
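
If you do want a stacked look drawn by seaborn itself, one common overlay trick is to plot cumulative counts, tallest layer first, so that each shorter bar covers the lower part of the one behind it. This is only a rough sketch and not necessarily the hack from the linked discussion. The seeded random data, the string column labels and the hand-built legend are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Sample data in the same shape as above (seeded for reproducibility)
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "mark": rng.choice([True, False], n),
    "period": rng.choice(["BASELINE", "WEEK 12", "WEEK 24", "WEEK 4"], n),
})
ct = pd.crosstab(df["period"], df["mark"])

# Use string column labels so they can be passed to seaborn as column names
ct.columns = ct.columns.astype(str)
layers = list(ct.columns)

# Cumulative counts across columns: the last column holds each row's total
cum = ct.cumsum(axis=1).reset_index()

fig, ax = plt.subplots()
palette = sns.color_palette(n_colors=len(layers))

# Draw the tallest (most cumulative) layer first, then overlay shorter ones
for color, col in reversed(list(zip(palette, layers))):
    sns.barplot(data=cum, x="period", y=col, color=color, ax=ax)

ax.set_ylabel("count")

# Build the legend by hand, since every layer is a separate barplot call
handles = [Patch(color=c, label=name) for c, name in zip(palette, layers)]
ax.legend(handles=handles, title="mark")

fig.savefig("stacked_seaborn.png")
```

The visible part of each bar is then the count for that layer, because the layer underneath it is hidden. It works, but it is more fiddly than letting pandas do the stacking.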

甜甜的少女心
#3 · 2020-02-09 05:53

As you said you can use pandas to create the stacked bar plot. The argument that you want to have a "seaborn plot" is irrelevant, since every seaborn plot and every pandas plot are in the end simply matplotlib objects, as the plotting tools of both libraries are merely matplotlib wrappers.

So here is a complete solution (taking the data creation from @andrew_reece's answer).

import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

n = 500
mark = np.random.choice([True,False], n)
periods = np.random.choice(['BASELINE','WEEK 12', 'WEEK 24', 'WEEK 4'], n)

df = pd.DataFrame({'mark':mark,'period':periods})
ct = pd.crosstab(df.period, df.mark)

ct.plot.bar(stacked=True)
plt.legend(title='mark')

plt.show()
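
If the real concern is that the chart should look like your other seaborn figures, one option is to apply the seaborn theme first and then let pandas do the stacking. The sketch below assumes seaborn 0.11 or later for `sns.set_theme()` (older versions use `sns.set()`). The seeded random data and the output filename are placeholders of mine.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Apply seaborn's default style and colour cycle to all matplotlib figures
sns.set_theme()

# Seeded sample data in the same shape as above
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "mark": rng.choice([True, False], n),
    "period": rng.choice(["BASELINE", "WEEK 12", "WEEK 24", "WEEK 4"], n),
})
ct = pd.crosstab(df["period"], df["mark"])

# pandas returns a plain matplotlib Axes, so it can be tweaked like any other
ax = ct.plot.bar(stacked=True)
ax.set_ylabel("count")
ax.legend(title="mark")
sns.despine(ax=ax)  # seaborn helpers work on this Axes too

ax.figure.savefig("stacked_pandas_seaborn_theme.png")
```

Because the returned object is an ordinary matplotlib Axes, anything you would normally do to a seaborn plot (titles, labels, despine) applies here as well.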
