I would like to use geom_bar with facets, showing percentages instead of absolute counts, where each percentage is relative to its own facet rather than to the overall count.
This has been discussed a lot (example), with the suggestion to use geom_bar(aes(y = (..count..)/sum(..count..))). This won't work with facets (i.e. it will give the share of the total count). A better solution has been suggested, using stat_count(mapping = aes(x=x_val, y=..prop..)) instead.
This seems to work if x is numeric, but not if x is character: all bars are 100%! Why? Am I doing something wrong? Thanks!
library(tidyverse)
# tibble() replaces the deprecated data_frame()
df <- tibble(val_num = c(rep(1, 60), rep(2, 40), rep(1, 30), rep(2, 70)),
             val_cat = ifelse(val_num == 1, "cat", "mouse"),
             group = rep(c("A", "B"), each = 100))
# works with numeric x
ggplot(df) + stat_count(mapping = aes(x = val_num, y = ..prop..)) + facet_grid(group ~ .)
# does not work with character x: every bar is 100%
ggplot(df) + stat_count(mapping = aes(x = val_cat, y = ..prop..)) + facet_grid(group ~ .)
Adding group=group to the aesthetics (here group is the column of df that is also used for faceting) tells ggplot to calculate proportions by group, rather than the default, which would be separately for each level of val_cat.

When the x-variable is continuous, it looks like stat_count by default calculates percentages over all data in the facet. However, when the x-variable is categorical, stat_count calculates percentages separately within each x level, so every bar comes out at 100%. Two cases illustrate this:

Adding val_num as the group aesthetic causes percentages to be calculated within each x level instead of over all values in a facet.

Turning val_num into a factor likewise causes percentages to be calculated within each x level instead of over all values in a facet.
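The three cases described in this answer can be sketched as follows. This is a minimal sketch, not the original answer's code: it rebuilds the question's data with base R, uses after_stat(prop) (the newer spelling of ..prop.., assuming ggplot2 3.3.0 or later), and introduces a hypothetical helper, facet_props(), that pulls the computed proportions out of a plot with layer_data() so they can be inspected without looking at the figure.

```r
library(ggplot2)

# Same data as in the question, rebuilt with base R.
val_num <- c(rep(1, 60), rep(2, 40), rep(1, 30), rep(2, 70))
df <- data.frame(
  val_num = val_num,
  val_cat = ifelse(val_num == 1, "cat", "mouse"),
  group   = rep(c("A", "B"), each = 100)
)

# Fix: group = group puts all rows of a facet into a single group,
# so prop is computed over the whole facet.
p_fixed <- ggplot(df) +
  stat_count(aes(x = val_cat, y = after_stat(prop), group = group)) +
  facet_grid(group ~ .)

# Case 1: numeric x, but grouped by x itself.
# Each group holds a single x value, so every bar is 100%.
p_grouped <- ggplot(df) +
  stat_count(aes(x = val_num, y = after_stat(prop), group = val_num)) +
  facet_grid(group ~ .)

# Case 2: x turned into a factor.
# The default grouping now splits by x level, so every bar is 100% again.
p_factor <- ggplot(df) +
  stat_count(aes(x = factor(val_num), y = after_stat(prop))) +
  facet_grid(group ~ .)

# Hypothetical helper (not part of ggplot2): return the computed
# proportions ordered by facet panel, then by x position.
facet_props <- function(p) {
  d <- layer_data(p)
  d[order(d$PANEL, d$x), "prop"]
}

facet_props(p_fixed)    # 0.6 0.4 0.3 0.7 (facet A, then facet B)
facet_props(p_grouped)  # 1 1 1 1
facet_props(p_factor)   # 1 1 1 1
```

Printing p_fixed draws the faceted bars with per-facet proportions. Adding scale_y_continuous(labels = scales::percent) should label the axis in percent, assuming the scales package is installed.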