I'm looking for a way to label a stacked bar chart with percentages while the y-axis shows the original count (using ggplot). Here is a MWE for the plot without labels:
library(ggplot2)
df <- as.data.frame(matrix(nrow = 7, ncol = 3,
                           data = c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6", "ID7",
                                    "north", "north", "north", "north", "south", "south", "south",
                                    "A", "B", "B", "C", "A", "A", "C"),
                           byrow = FALSE))
colnames(df) <- c("ID", "region", "species")
p <- ggplot(df, aes(x = region, fill = species))
p + geom_bar()
I have a much larger table, and R counts the different species for every region quite nicely. Now I would like to show both the original count (preferably on the y-axis) and the percentage (as a label), so that I can compare the proportions of species between regions.
I tried out many things using geom_text(), but I think the main difference to other questions (e.g. this one) is that
- I do not have a separate column for y values (they are just the counts of different species per region) and
- I need the labels within each region to sum to 100% (since the regions are considered to represent separate populations), rather than all labels across the entire plot.
Any help is much appreciated!!
As @Gregor mentioned, summarize the data separately and then feed the data summary to ggplot; dplyr can create that summary on the fly.
Update: With ggplot2 2.2.0 and later, you no longer need to provide a y-value to center the text within each bar. Instead, you can use position_stack(vjust = 0.5).
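A minimal sketch of the summarize-then-plot approach described above, using the toy data from the question. The names df_summary and pct are chosen here purely for illustration, and the sketch assumes ggplot2 2.2.0 or later (for geom_col() and the vjust argument of position_stack()) together with a dplyr version that provides count():

```r
library(ggplot2)
library(dplyr)

# Same toy data as in the question
df <- data.frame(
  ID      = paste0("ID", 1:7),
  region  = c("north", "north", "north", "north", "south", "south", "south"),
  species = c("A", "B", "B", "C", "A", "A", "C")
)

# One row per region/species combination: raw count (n) and
# the share of that species within its region (pct).
# df_summary and pct are illustrative names, not from the original post.
df_summary <- df %>%
  count(region, species) %>%
  group_by(region) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# The y-axis shows the raw counts; the labels show within-region percentages,
# centered in each bar segment by position_stack(vjust = 0.5).
p <- ggplot(df_summary, aes(x = region, y = n, fill = species)) +
  geom_col() +
  geom_text(aes(label = sprintf("%.0f%%", 100 * pct)),
            position = position_stack(vjust = 0.5)) +
  labs(y = "count")

p
```

Because pct is computed after group_by(region), the shares add up to 100% within each region rather than across the whole plot (up to rounding in the printed labels). On ggplot2 versions older than 2.2.0, a label y-position would have to be computed in the summary instead.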
I agree with Johanna. You could try: