Plot hline at mean with geom_bar and stat="identity"

Posted 2019-07-25 11:23

Question:

I have a barplot where the exact bar heights are in the dataframe.

df <- data.frame(x=LETTERS[1:6], y=c(1:6, 1:6 + 1), g=rep(x = c("a", "b"), each=6))

ggplot(df, aes(x=x, y=y, fill=g, group=g)) + 
  geom_bar(stat="identity", position="dodge")

Now I want to add two hlines displaying the mean of all bars per group. All I get with

ggplot(df, aes(x=x, y=y, fill=g, group=g)) + 
  geom_bar(stat="identity", position="dodge") +
  stat_summary(fun.y=mean, aes(yintercept=..y.., group=g), geom="hline")

is

As I want to do this for an arbitrary number of groups as well, I would appreciate a solution with ggplot only.

I want to avoid a solution like this, because it does not rely purely on the dataset passed to ggplot, contains redundant code, and does not adapt to the number of groups:

ggplot(df, aes(x=x, y=y, fill=g, group=g)) + 
  geom_bar(stat="identity", position="dodge") +
  geom_hline(yintercept=mean(df$y[df$g=="a"]), col="red") +
  geom_hline(yintercept=mean(df$y[df$g=="b"]), col="green")

Thanks in advance!

Edits:

  • added dataset
  • comment on resulting code
  • changed the data and plots to clarify the question

Answer 1:

If I understand your question correctly, your first approach is almost there:

ggplot(df, aes(x = x, y = y, fill = g, group = g)) + 
  geom_col(position="dodge") + # geom_col is equivalent to geom_bar(stat = "identity")
  stat_summary(fun.y = mean, aes(x = 1, yintercept = ..y.., group = g), geom = "hline")

According to the help file for stat_summary:

stat_summary operates on unique x; ...

In this case, stat_summary inherits the top-level aesthetic mappings x = x and group = g by default, so it calculates the mean y value at each x for each value of g, which yields one horizontal line per bar instead of one per group. Adding x = 1 to stat_summary's mapping overrides x = x (while retaining group = g), so we get a single mean y value for each value of g instead.
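
To see the effect of the grouping in numbers, here is a minimal sketch in base R. It does not call ggplot2: aggregate only stands in for the summary that stat_summary performs, and the names per_x_and_group and per_group are made up for illustration.

```r
# Data from the question: bar heights for groups "a" and "b"
df <- data.frame(x = LETTERS[1:6],
                 y = c(1:6, 1:6 + 1),
                 g = rep(c("a", "b"), each = 6))

# Inherited mapping (x = x, group = g): one mean per x/g combination
per_x_and_group <- aggregate(y ~ x + g, data = df, FUN = mean)
nrow(per_x_and_group)  # 12 rows, one value per bar

# With x held constant (as with x = 1): one mean per group
per_group <- aggregate(y ~ g, data = df, FUN = mean)
per_group              # a: 3.5, b: 4.5
```

The two values in per_group are the heights at which the two lines should appear. Newer ggplot2 releases (3.3.0 onward) offer fun in place of fun.y and after_stat(y) in place of ..y.., so the call above may emit deprecation warnings on current versions.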