Show % instead of counts in charts of categorical

2018-12-31 14:15发布

I'm plotting a categorical variable and instead of showing the counts for each category value.

I'm looking for a way to get ggplot to display the percentage of values in that category. Of course, it is possible to create another variable with the calculated percentage and plot that one, but I have to do it several dozens of times and I hope to achieve that in one command.

I was experimenting with something like

qplot(mydataf) +
  stat_bin(aes(n = nrow(mydataf), y = ..count../n)) +
  scale_y_continuous(formatter = "percent")

but I must be using it incorrectly, as I got errors.

To easily reproduce the setup, here's a simplified example:

mydata <- c ("aa", "bb", NULL, "bb", "cc", "aa", "aa", "aa", "ee", NULL, "cc");
mydataf <- factor(mydata);
qplot (mydataf); #this shows the count, I'm looking to see % displayed.

In the real case, I'll probably use ggplot instead of qplot, but the right way to use stat_bin still eludes me.

I've also tried these four approaches:

ggplot(mydataf, aes(y = (..count..)/sum(..count..))) + 
  scale_y_continuous(formatter = 'percent');

ggplot(mydataf, aes(y = (..count..)/sum(..count..))) + 
  scale_y_continuous(formatter = 'percent') + geom_bar();

ggplot(mydataf, aes(x = levels(mydataf), y = (..count..)/sum(..count..))) + 
  scale_y_continuous(formatter = 'percent');

ggplot(mydataf, aes(x = levels(mydataf), y = (..count..)/sum(..count..))) + 
  scale_y_continuous(formatter = 'percent') + geom_bar();

but all 4 give:

Error: ggplot2 doesn't know how to deal with data of class factor

The same error appears for the simple case of

ggplot (data=mydataf, aes(levels(mydataf))) +
  geom_bar()

so it's clearly something about how ggplot interacts with a single vector. I'm scratching my head, googling for that error gives a single result.

标签: r ggplot2
9条回答
孤独总比滥情好
2楼-- · 2018-12-31 14:57

If you want percentages on the y-axis and labeled on the bars:

library(ggplot2)
library(scales)
ggplot(mtcars, aes(x = as.factor(am))) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  geom_text(aes(y = ((..count..)/sum(..count..)), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = percent) +
  labs(title = "Manual vs. Automatic Frequency", y = "Percent", x = "Automatic Transmission")

enter image description here

When adding the bar labels, you may wish to omit the y-axis for a cleaner chart, by adding to the end:

  theme(
        axis.text.y=element_blank(), axis.ticks=element_blank(),
        axis.title.y=element_blank()
  )

enter image description here

查看更多
其实,你不懂
3楼-- · 2018-12-31 14:58

With ggplot2 version 2.1.0 it is

+ scale_y_continuous(labels = scales::percent)
查看更多
荒废的爱情
4楼-- · 2018-12-31 15:00

this modified code should work

p = ggplot(mydataf, aes(x = foo)) + 
    geom_bar(aes(y = (..count..)/sum(..count..))) + 
    scale_y_continuous(formatter = 'percent')

if your data has NAs and you dont want them to be included in the plot, pass na.omit(mydataf) as the argument to ggplot.

hope this helps.

查看更多
登录 后发表回答