I have a multi-index dataframe that I've melted to look something like this:
Color  Frequency              variable  value
Red    2-3 times a month      x         22
Red    A few days a week      x         45
Red    At least once a day    x         344
Red    Never                  x         5
Red    Once a month           x         1
Red    Once a week            x         0
Red    Once every few months  x         4
Blue   2-3 times a month      x         4
Blue   A few days a week      x         49
Blue   At least once a day    x         200
Blue   Never                  x         7
Blue   Once a month           x         19
Blue   Once a week            x         10
Blue   Once every few months  x         5
Red    2-3 times a month      y         3
Red    A few days a week      y         97
Red    At least once a day    y         144
Red    Never                  y         4
Red    Once a month           y         0
Red    Once a week            y         0
Red    Once every few months  y         4
Blue   2-3 times a month      y         44
Blue   A few days a week      y         62
Blue   At least once a day    y         300
Blue   Never                  y         2
Blue   Once a month           y         4
Blue   Once a week            y         23
Blue   Once every few months  y         6
Red    2-3 times a month      z         4
Red    A few days a week      z         12
Red    At least once a day    z         101
Red    Never                  z         0
Red    Once a month           z         0
Red    Once a week            z         10
Red    Once every few months  z         0
Blue   2-3 times a month      z         100
Blue   A few days a week      z         203
Blue   At least once a day    z         299
Blue   Never                  z         0
Blue   Once a month           z         0
Blue   Once a week            z         204
Blue   Once every few months  z         100
I'm trying to make a seaborn bar plot where the x-axis is grouped by two categories, variable and Frequency, and the hue is based on Color. Moreover, I want the y-axis to be the proportion of value over the sum of the values for that variable within each Color; e.g. the y-value for Red at variable x, "2-3 times a month" should be 22/(22+45+344+5+1+0+4) = 22/421, or about 5.23%.
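To make the target number concrete, here is a quick check of that one example. The list simply copies the seven Red values for variable x from the table above; the names red_x and share are only illustrative.

```python
# Red values for variable "x", in the order they appear in the table
red_x = [22, 45, 344, 5, 1, 0, 4]

# proportion of the first entry ("2-3 times a month") within its group
share = red_x[0] / sum(red_x)

print(f"{share:.2%}")  # 5.23%
```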
So far I have this:
import matplotlib.pyplot as plt
import seaborn as sns

fig, ax1 = plt.subplots(figsize=(20, 10))
# factorplot was renamed to catplot in seaborn 0.9
sns.factorplot(x='variable', y='value', hue='Frequency', data=df, kind='bar', ax=ax1)
This gets part of the way there. How do I also 1) group by Color and 2) plot the proportion of value for each variable and Frequency, rather than the raw count?
This is what you need to find the proportion of each value within its group:
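A minimal sketch of that calculation, assuming the melted frame is named df and has the four columns shown in the question. The hard-coded sample rows and the proportion column name are illustrative assumptions, not something the question defines. The idea is to divide each value by the total of its Color and variable group using groupby(...).transform('sum'), which returns one total per row so the division lines up. The resulting column could then be used as y in the bar plot, with hue='Color'.

```python
import pandas as pd

freqs = ["2-3 times a month", "A few days a week", "At least once a day",
         "Never", "Once a month", "Once a week", "Once every few months"]

# values copied from the question's table, keyed by (variable, Color)
values = {
    ("x", "Red"):  [22, 45, 344, 5, 1, 0, 4],
    ("x", "Blue"): [4, 49, 200, 7, 19, 10, 5],
    ("y", "Red"):  [3, 97, 144, 4, 0, 0, 4],
    ("y", "Blue"): [44, 62, 300, 2, 4, 23, 6],
    ("z", "Red"):  [4, 12, 101, 0, 0, 10, 0],
    ("z", "Blue"): [100, 203, 299, 0, 0, 204, 100],
}

# rebuild the melted (long-format) frame, one row per Frequency level
rows = [
    {"Color": color, "Frequency": freq, "variable": var, "value": val}
    for (var, color), vals in values.items()
    for freq, val in zip(freqs, vals)
]
df = pd.DataFrame(rows)

# total of "value" within each (Color, variable) group, aligned row by row
group_total = df.groupby(["Color", "variable"])["value"].transform("sum")

# hypothetical column name: each row's share of its group total
df["proportion"] = df["value"] / group_total

print(df.head(7))
```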
Output: