R - Aggregate Percentage for Stacked Bar Charts

Posted 2020-03-31 11:35

I have some data that looks like the below. I'm aiming to generate stacked bar charts from it, but I need the values to be shown as percentages. I've managed to melt the data into the right shape and draw the stacked bars, but the values go far beyond 100% (in my actual dataset, some stacks add up to 8000+). What is the correct way to set up ggplot2 so that I can create stacked bar charts in percentages?

#Raw Data
x   A    B    C
1   5   10   14
1   4    4   14
2   5   10   14
2   4    4   14
3   5   10   14
3   4    4   14

#Aggregate
data <- read.table(...)
data <- aggregate(. ~ x, data, sum) #<---- Sum to Average? 
x   A    B    C
1   9   14   28
2   9   14   28
3   9   14   28

#Melt Data
library(reshape2) # assumed: melt() here is the reshape2 version
data <- melt(data, "x")
  x variable value
1 1        A     9
2 2        A     9
3 3        A     9
4 1        B    14
5 2        B    14
6 3        B    14
7 1        C    28
8 2        C    28
9 3        C    28

#Plot stack bar chart counts
ggplot(data, aes(x=1, y=value, fill=variable)) + geom_bar(stat="identity") + facet_grid(.~x)

I'm hoping to get something like this before the melt so that I can melt it and plot that as a stacked bar chart, but I'm not sure how to approach this.

#Ideal Data Format - After Aggregate, Before Melt
x     A       B       C
1   17.65   27.45   54.90
2   17.65   27.45   54.90
3   17.65   27.45   54.90
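
One possible way to get this wide-format table, sketched in base R under the assumption that every column other than x is a numeric value column. The names wide, wide_pct and value_cols are made up for the illustration, and the numbers are the aggregated sums shown earlier in the question.

```r
# Aggregated wide-format data from the question (sums per x)
wide <- data.frame(
  x = 1:3,
  A = c(9, 9, 9),
  B = c(14, 14, 14),
  C = c(28, 28, 28)
)

# Every column except x is treated as a value column (assumption)
value_cols <- setdiff(names(wide), "x")

# Divide each row by its own total and scale to percent
wide_pct <- wide
wide_pct[value_cols] <- wide[value_cols] / rowSums(wide[value_cols]) * 100

# Round only for display; keep full precision in wide_pct
print(round(wide_pct, 2))
```

After this step each row of the value columns sums to 100, so melting wide_pct and plotting it with stat = "identity" gives stacks that top out at 100.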

Q: What is the correct way to create a stacked bar chart with percentages, using ggplot2?


Tags: r ggplot2
1 answer
我只想做你的唯一
#2 · 2020-03-31 12:40

You can calculate the proportions from your melted data and then draw the figure. Here I calculate the percentage within each level of x using group_by() from the dplyr package; there are other ways to do this as well. The mutate() line reads as "for each level of x, compute each value's share of that level's total, as a percent." To remove the grouping on x afterwards, I added ungroup() at the end.

library(dplyr)
library(ggplot2)

### foo is your melted data
ana <- foo %>%
       group_by(x) %>%
       mutate(percent = value / sum(value) * 100) %>%
       ungroup()

### Plot once
bob <- ggplot(data = ana, aes(x = x, y = percent, fill = variable)) +
       geom_bar(stat = "identity") +
       labs(y = "Percentage (%)")

### Get ggplot data
caroline <- ggplot_build(bob)$data[[1]]

### Create values for text positions (just above the top of each segment)
caroline$position <- caroline$ymax + 1

### Build the label from each segment's own height (ymax - ymin),
### rounded to 2 digits. This keeps foo intact and does not rely on row order.
caroline$label <- paste0(round(caroline$ymax - caroline$ymin, digits = 2), "%")

### Plot again
bob + annotate(x = caroline$x, y = caroline$position,
               label = caroline$label, geom = "text", size=3) 
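
As one of the other options, ggplot2 can do the normalisation itself with position = "fill", which rescales every stack to a height of 1. The sketch below is an alternative to the code above, not a replacement for it. It assumes the scales package is installed, rebuilds the sample data under the made-up name long so it runs on its own, and adds a helper column share that is only used for the text labels.

```r
library(ggplot2)

# Same values as the long-format sample data, rebuilt so the sketch is self-contained
long <- data.frame(
  x = rep(1:3, times = 3),
  variable = factor(rep(c("A", "B", "C"), each = 3)),
  value = rep(c(9, 14, 28), each = 3)
)

# Share of each value within its x group (0 to 1), used only for the labels
long$share <- ave(long$value, long$x, FUN = function(v) v / sum(v))

p <- ggplot(long, aes(x = factor(x), y = value, fill = variable)) +
  # position = "fill" rescales each stack to a total height of 1
  geom_col(position = "fill") +
  # centre each label vertically inside its own segment
  geom_text(aes(label = scales::percent(share, accuracy = 0.01), group = variable),
            position = position_fill(vjust = 0.5), size = 3) +
  # show the 0-1 axis as 0%-100%
  scale_y_continuous(labels = scales::percent) +
  labs(x = "x", y = "Percentage (%)")
```

Call print(p) to draw the chart. With this approach the raw values are passed straight to ggplot2, so no percent column is needed for the bars themselves.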


DATA

foo <-structure(list(x = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), variable = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
value = c(9L, 9L, 9L, 14L, 14L, 14L, 28L, 28L, 28L)), .Names = c("x", 
"variable", "value"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9"))