I have the following data:
id <- 1:80
gyrA <- sample(c(1,0), 80, replace = TRUE)
parC <- sample(c(1,0), 80, replace = TRUE)
marR <- sample(c(1,0), 80, replace = TRUE)
qnrS <- sample(c(1,0), 80, replace = TRUE)
marA <- sample(c(1,0), 80, replace = TRUE)
ydhE <- sample(c(1,0), 80, replace = TRUE)
qnrA <- sample(c(1,0), 80, replace = TRUE)
qnrB <- sample(c(1,0), 80, replace = TRUE)
qnrD <- sample(c(1,0), 80, replace = TRUE)
mcbE <- sample(c(1,0), 80, replace = TRUE)
oqxAB <- sample(c(1,0), 80, replace = TRUE)
species <- sample(c("Wild bird","Pig","Red Fox","Broiler"), 80, replace = TRUE)
test_data <- data.frame(id,species,gyrA,parC,marR,marA,qnrS,qnrA,qnrB,qnrD,ydhE,mcbE,oqxAB)
library(dplyr)
library(tidyr)    # provides gather()
library(ggplot2)  # provides ggplot() and geom_tile()
# Gather only the presence/absence columns; id and species stay as they are
plot_data <- test_data %>%
  gather(key = "gene", value = "value", -id, -species) %>%
  mutate(id = factor(id, levels = unique(id)),
         gene = factor(gene, levels = unique(gene)))
I want to create a heatmap with presence/absence of the genes in the data. However, I also want a column with the species in the same plot. I gathered all the presence/absence columns (gyrA, parC etc.) into one column.
I have managed to create the heatmap, but not with species included. Preferably, I want to be able to add columns with any data I might get later on related to these samples.
The plot:
ggplot(plot_data, aes(gene, id, fill = value)) +
  geom_tile(color = "black") +
  theme_classic()
How do I add a column with species to the plot, so that it looks like this?
Is there any simple way to do this? If easier, is it possible to at least create a column with text that says which species is represented at each row?
EDIT
Based on the OP's comment, I have adapted the sample data to reflect the actual question.
With clustered species for readability:
Something a bit closer to what the OP would like, using some workarounds (but I think the resulting figure is less clear than the first one).
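Both ideas can be sketched roughly as follows. This is only a minimal illustration, not necessarily the code behind the figures: it assumes the dplyr, tidyr and ggplot2 packages, builds its own small data set (20 samples, 4 genes) rather than the full sample data, and the names `genes`, `species_col`, `combined` and `fill_value`, as well as the colours, are made up for the example.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # illustrative data only, smaller than the question's sample data
genes <- c("gyrA", "parC", "marR", "qnrS")
test_data <- data.frame(
  id = 1:20,
  species = sample(c("Wild bird", "Pig", "Red Fox", "Broiler"), 20, replace = TRUE)
)
for (g in genes) test_data[[g]] <- sample(c(1, 0), 20, replace = TRUE)

# Long format: one row per sample/gene pair; species stays in its own column
plot_data <- test_data %>%
  pivot_longer(all_of(genes), names_to = "gene", values_to = "value") %>%
  mutate(id = factor(id),
         gene = factor(gene, levels = genes),
         value = factor(value, levels = c(0, 1), labels = c("absent", "present")))

# Variant 1: cluster the rows by species with facets (one strip per species)
p_facet <- ggplot(plot_data, aes(gene, id, fill = value)) +
  geom_tile(color = "black") +
  facet_grid(species ~ ., scales = "free_y", space = "free_y") +
  scale_fill_manual(values = c(absent = "white", present = "steelblue")) +
  theme_classic() +
  theme(strip.text.y = element_text(angle = 0))

# Variant 2 (workaround): treat species as one more tile column.
# Presence/absence and species then share a single discrete fill scale.
species_col <- test_data %>%
  transmute(id = factor(id), gene = "species", fill_value = species)

combined <- plot_data %>%
  transmute(id, gene = as.character(gene), fill_value = as.character(value)) %>%
  bind_rows(species_col) %>%
  mutate(gene = factor(gene, levels = c("species", genes)))

p_column <- ggplot(combined, aes(gene, id, fill = fill_value)) +
  geom_tile(color = "black") +
  theme_classic()
```

Printing `p_facet` or `p_column` draws the respective figure. The second variant mixes two kinds of information in one legend, which is part of why it tends to be harder to read; further per-sample columns could be appended to `combined` in the same way as `species_col`.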
Sample data