I'm plotting a stacked bar plot in ggplot2. My dataset looks like this:
var1 var2 var3 value
treatment1 group_1 C8.0 0.010056478
treatment2 group_1 C8.0 0.009382918
treatment3 group_2 C8.0 0.003014983
treatment4 group_2 C8.0 0.005349631
treatment5 group_2 C8.0 0.005349631
var1 contains 5 treatments. These five treatments belong to two groups in var2, and each treatment has 14 measurements in var3, with the measured values stored in the value column.
I want to compare these five treatments and their measurements, so I drew a stacked bar plot like the one in this figure:
My code:
library(ggplot2)
library(RColorBrewer)  # provides brewer.pal()
colourCount <- length(unique(mydata$var3))
# YlGnBu offers at most 9 colours, so interpolate to get as many as needed
getPalette <- colorRampPalette(brewer.pal(9, "YlGnBu"))
ggplot(data = mydata, aes(x = var1, y = value, fill = var3)) +
  geom_bar(stat = "identity", position = "stack", colour = "black", width = .2) +
  # geom_errorbar(aes(ymax = var3 + se, ymin = var3 - se, width = .1)) +
  scale_fill_manual(values = getPalette(colourCount)) +
  scale_y_continuous(expand = c(0, 0)) +
  mytheme
How can I group the first two stacked columns together and the other three together, since they belong to two groups in var2?
The "duplicate question" comments above will lead you to an answer like this one:
library(dplyr)
library(ggplot2)
dummydf <- expand.grid(var1 = paste0("trt", 1:5),
                       var3 = paste0("C_", 11:15)) %>%
  mutate(value = runif(length(var1)),
         var2 = ifelse(var1 %in% c("trt1", "trt2"), "grp1", "grp2"))
ggplot(dummydf, aes(var1, value, fill = var3)) +
  geom_col(position = "stack") +
  facet_grid(~var2, scales = "free_x", space = "free_x")
And this solution is sometimes great! The advantages are:
- It's simple to implement.
- It puts the labels for the hierarchical grouping at the top.
- It generally looks nice.
- It's easily customizable.
For example:
ggplot(dummydf, aes(var1, value, fill = var3)) +
  geom_col(position = "stack") +
  facet_grid(~var2, scales = "free_x", space = "free_x") +
  theme(panel.spacing = unit(3, "cm"),
        strip.text = element_text(size = 12, family = "mono"))
The main disadvantages of this method:
- If this is already part of a facet-panelled plot, it makes the whole thing cluttered.
- If your hierarchical groups are obvious from the treatments, you may not need explicit labelling, just a quick visual distinction. For example, say the groups are control/intervention, and your treatments are "no drug, placebo" and "drugs 1, 2, and 3".
So here's an alternate method:
dummydf %>%
  # trt99 is a dummy level that reserves an empty slot between the two groups
  # (tibble() replaces the deprecated data_frame())
  bind_rows(tibble(var1 = "trt99")) %>%
  ggplot(aes(var1, value, fill = var3)) +
  geom_col(position = "stack") +
  scale_x_discrete(limits = c("trt1", "trt2", "trt99", "trt3", "trt4", "trt5"),
                   breaks = c("trt1", "trt2", NA, "trt3", "trt4", "trt5"),
                   labels = c("trt1", "trt2", "", "trt3", "trt4", "trt5"))
This solution has its own drawbacks, primarily that you can only customize the space in a limited way. You can create a "false" bar equal to an integer multiple of the widths of the bars you've already got by adding additional false levels to your limits, breaks, and labels. But you can't create a space that's only half a bar wide.
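To illustrate the "integer multiple" point, here is a minimal sketch of a gap two bars wide. It is not the original code. It assumes that limits may contain values absent from the data (so no dummy row is bound), and the spacer names gap_a and gap_b are hypothetical placeholders. set.seed() is there only to make the dummy values reproducible.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # only so the dummy values are reproducible
dummydf <- expand.grid(var1 = paste0("trt", 1:5),
                       var3 = paste0("C_", 11:15),
                       stringsAsFactors = FALSE) %>%
  mutate(value = runif(n()),
         var2 = ifelse(var1 %in% c("trt1", "trt2"), "grp1", "grp2"))

# Hypothetical placeholder names: any strings that are not real treatments
spacers <- c("gap_a", "gap_b")
x_limits <- c("trt1", "trt2", spacers, "trt3", "trt4", "trt5")

p_gap <- ggplot(dummydf, aes(var1, value, fill = var3)) +
  geom_col(position = "stack") +
  # limits fixes the axis positions (including the two empty ones);
  # breaks keeps ticks and labels for the real treatments only
  scale_x_discrete(limits = x_limits,
                   breaks = setdiff(x_limits, spacers))
```

Each extra name in spacers widens the gap by one more bar position, which is why the gap can only grow in whole-bar steps.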
You could provide additional information in the false bar space though:
- Adding a text annotation in the plot area
- Replacing the NA and "" in breaks and labels with "trt99" and "<-group1 | group2->", or something similar.
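Both suggestions can be sketched together as follows. This is an illustration under assumptions rather than the original code: the annotation text, its y position, and the use of a numeric x = 3 to hit the third axis slot are arbitrary choices, and the dummy data is rebuilt with a fixed seed so the block stands alone.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # only so the dummy values are reproducible
dummydf <- expand.grid(var1 = paste0("trt", 1:5),
                       var3 = paste0("C_", 11:15),
                       stringsAsFactors = FALSE) %>%
  mutate(value = runif(n()),
         var2 = ifelse(var1 %in% c("trt1", "trt2"), "grp1", "grp2"))

x_limits <- c("trt1", "trt2", "trt99", "trt3", "trt4", "trt5")

p_label <- ggplot(dummydf, aes(var1, value, fill = var3)) +
  geom_col(position = "stack") +
  # Give the false bar (trt99) a real break and a descriptive label
  scale_x_discrete(limits = x_limits,
                   breaks = x_limits,
                   labels = c("trt1", "trt2", "<-group1 | group2->",
                              "trt3", "trt4", "trt5")) +
  # Text inside the plot area; trt99 is the third position on the axis
  annotate("text", x = 3, y = 1, label = "grp1 | grp2", angle = 90)
```

In practice you would probably pick one of the two, since the axis label and the annotation carry the same information.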
I think you just need to add a new column to your data set that labels every row as either "treatment3" or "not treatment3". I used the dplyr package:
df1 <- dplyr::mutate(mydata,
                     var4 = ifelse(var1 == "treatment3", "treatment3", "not treatment3"))
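The answer stops at creating the column. One possible way to use it, sketched here as an assumption rather than as the answerer's intent, is to facet on it. The mydata below is a hypothetical stand-in with the same columns as the question's data, and the "C10.0" measurement name is invented for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for mydata (two measurements instead of 14)
mydata <- expand.grid(var1 = paste0("treatment", 1:5),
                      var3 = c("C8.0", "C10.0"),
                      stringsAsFactors = FALSE) %>%
  mutate(value = seq(0.001, 0.010, length.out = n()))

# New column that splits the rows into two sets
df1 <- mutate(mydata,
              var4 = ifelse(var1 == "treatment3", "treatment3", "not treatment3"))

# Assumed usage: one facet panel per value of var4
p_var4 <- ggplot(df1, aes(var1, value, fill = var3)) +
  geom_col(position = "stack") +
  facet_grid(~var4, scales = "free_x", space = "free_x")
```

For the grouping asked about in the question, the existing var2 column could be used in facet_grid() in the same way.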