Adding columns with metadata to geom_tile ggplot

Posted 2019-08-31 10:05

I have the following data

id <- 1:80
gyrA <- sample(c(1,0), 80, replace = TRUE)
parC <- sample(c(1,0), 80, replace = TRUE)
marR <- sample(c(1,0), 80, replace = TRUE)
qnrS <- sample(c(1,0), 80, replace = TRUE)
marA <- sample(c(1,0), 80, replace = TRUE)
ydhE <- sample(c(1,0), 80, replace = TRUE)
qnrA <- sample(c(1,0), 80, replace = TRUE)
qnrB <- sample(c(1,0), 80, replace = TRUE)
qnrD <- sample(c(1,0), 80, replace = TRUE)
mcbE <- sample(c(1,0), 80, replace = TRUE)
oqxAB <- sample(c(1,0), 80, replace = TRUE)
species <- sample(c("Wild bird","Pig","Red Fox","Broiler"), 80, replace = TRUE)

test_data <- data.frame(id,species,gyrA,parC,marR,marA,qnrS,qnrA,qnrB,qnrD,ydhE,mcbE,oqxAB)


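library(dplyr)
library(tidyr)    # gather() comes from tidyr
library(ggplot2)

# Keep id and species as identifier columns; gather only the gene columns
plot_data <- test_data %>%
  gather(key = "gene", value = "value", -c(id, species)) %>%
  mutate(id = factor(id, levels = unique(id)),
         gene = factor(gene, levels = unique(gene)))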

I want to create a heatmap with presence/absence of the genes in the data. However, I also want a column with the species in the same plot. I gathered all the presence/absence columns (gyrA, parC etc.) into one column.

I have managed to create the heatmap, but not with species included. Preferably, I want to be able to add columns with any other data I might get later on for these samples.

The plot:

ggplot(plot_data, aes(gene, id, fill = value))+
  geom_tile(color = "black")+
  theme_classic()

How do I add a column with species to the plot, so that it looks like this? (image not shown)

Is there any simple way to do this? If easier, is it possible to at least create a column with text that says which species is represented at each row?

Tags: r ggplot2
1 answer
做自己的国王
#2 · 2019-08-31 10:34

EDIT

Based on the OP's comment, I have adapted the sample data (see "Sample data" at the end of this answer) to reflect the actual question.

colors <- c("#b13da1", "#00b551" , "#fff723" , "#ff0022")

# Label each tile with its species when the gene is present, "0" when absent
plot_data$label <- paste("1 -", as.character(plot_data$species))
plot_data$label[plot_data$value==0] <- "0"

ggplot(plot_data, aes(gene, id, fill = label))+
  geom_tile(color = "black")+
  theme_classic()+
  # "0" sorts first, so it gets white; the species labels follow alphabetically
  scale_fill_manual(values = c("white", colors), name = "Value")+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  xlab("Gene") + ylab("id")

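The plot above relies on the labels sorting alphabetically so that they line up with the colour vector. A minimal sketch of a more explicit variant, assuming the same labelling rule, is to pass a named vector to `scale_fill_manual()` so each colour is tied to its label by name. The toy data and the names `toy`, `species_colors` and `fill_values` are made up for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the toy data is reproducible

# Toy data in the same long format as plot_data (3 genes x 8 samples)
species_by_id <- rep(c("Wild bird", "Pig", "Red Fox", "Broiler"), each = 2)
toy <- expand.grid(id = factor(1:8), gene = c("gyrA", "parC", "marR"))
toy$species <- species_by_id[as.integer(toy$id)]
toy$value <- sample(c(1, 0), nrow(toy), replace = TRUE)

# Same labelling rule as above: "0" when absent, "1 - <species>" when present
toy$label <- ifelse(toy$value == 0, "0", paste("1 -", toy$species))

# Named palette: each colour is matched to a label by name, not by position
species_colors <- c(
  "Broiler"   = "#b13da1",
  "Pig"       = "#00b551",
  "Red Fox"   = "#fff723",
  "Wild bird" = "#ff0022"
)
fill_values <- c("0" = "white",
                 setNames(species_colors, paste("1 -", names(species_colors))))

p <- ggplot(toy, aes(gene, id, fill = label)) +
  geom_tile(color = "black") +
  scale_fill_manual(values = fill_values, name = "Value") +
  theme_classic()
p
```

With named values, adding or renaming a species should not silently shift the other colours, which can happen with a positional vector.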

With clustered species for readability:

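library(forcats)

# factor() is needed because species may be a character column
# (the data.frame() default since R 4.0), and as.numeric() on text gives NA
ggplot(plot_data, aes(gene, fct_reorder(id, as.numeric(factor(species))), fill = label))+
  geom_tile(color = "black")+
  theme_classic()+
  scale_fill_manual(values = c("white", colors), name = "Value")+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  xlab("Gene") + ylab("id")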

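The clustering works because ggplot2 draws a discrete axis in the order of the factor levels. As a rough sketch of the same idea without forcats, assuming `species` is a character column, the levels of `id` can be set directly from a species-sorted order. The data frame `samples` and the name `id_order` are made up for illustration.

```r
# Toy data: one row per sample (assumption: species is a character column)
samples <- data.frame(
  id = 1:8,
  species = c("Pig", "Broiler", "Wild bird", "Pig",
              "Red Fox", "Broiler", "Wild bird", "Red Fox"),
  stringsAsFactors = FALSE
)

# Sort ids by species (ties broken by id) and use that order as factor levels
id_order <- samples$id[order(samples$species, samples$id)]
samples$id <- factor(samples$id, levels = id_order)

levels(samples$id)
# "2" "6" "1" "4" "5" "8" "3" "7"
```

In the long-format `plot_data`, the same levels would have to be applied to the `id` column before plotting.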


Something a bit closer to what the OP would like using some workarounds (but I think the resulting figure is less clear than the first one).

# Build an extra "Species" column: the first 10 rows hold each of the 10 sample ids once
newdata <- plot_data[1:10,]
newdata$gene <- "Species"
newdata$value <- newdata$species
plot_data <- rbind(plot_data, newdata)

plot_data$value <- as.factor(plot_data$value)
# Add two artificial empty levels to split the legend into 2 columns
levels(plot_data$value) <- c(levels(plot_data$value ), "")
levels(plot_data$value) <- c(levels(plot_data$value ), " ")
# Legend order: 1, 0, the two empty levels, then the four species
plot_data$value <- factor(plot_data$value, levels(plot_data$value)[c(1,2,7,8,3:6)])
# Move the "Species" column (level 12) in front of the 11 genes
plot_data$gene <- factor(plot_data$gene, levels(plot_data$gene)[c(12, 1:11)])

colors <- c("#b13da1", "#00b551" , "#fff723" , "#ff0022")

ggplot(plot_data, aes(gene, id, fill = value))+
  geom_tile()+
  geom_tile(color = "black", show.legend = FALSE)+
  theme_classic()+
  scale_fill_manual(values = c("#403f3f", "grey", "white", "white", colors),
                    name = "Value Species", drop = FALSE)+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  guides(fill = guide_legend(ncol=2)) +
  xlab("Gene") + ylab("id")+
  scale_x_discrete(position = "top")

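The question also asks whether the species could at least be shown as text for each row. One possible sketch, not part of the original answer, is to put the species into the y-axis labels with `scale_y_discrete(labels = ...)`, which accepts a named vector keyed by the factor levels. The toy data and the names `toy` and `row_labels` are assumptions for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the toy data is reproducible

# Toy data: 6 samples x 3 genes, with one species per sample
species_by_id <- c("Pig", "Broiler", "Wild bird", "Pig", "Red Fox", "Broiler")
toy <- expand.grid(id = factor(1:6), gene = c("gyrA", "parC", "marR"))
toy$value <- factor(sample(c(1, 0), nrow(toy), replace = TRUE), levels = c(1, 0))

# Named lookup: factor level of id -> axis label "id (species)"
row_labels <- setNames(paste0(levels(toy$id), " (", species_by_id, ")"),
                       levels(toy$id))

p <- ggplot(toy, aes(gene, id, fill = value)) +
  geom_tile(color = "black") +
  scale_fill_manual(values = c("1" = "#403f3f", "0" = "grey"), name = "Value") +
  scale_y_discrete(labels = row_labels) +
  theme_classic()
p
```

This keeps the fill scale for presence/absence only, so other per-sample metadata could be appended to the label text in the same way.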

Sample data

test_data <- test_data[1:10,]

library(dplyr)
library(tidyr)  # gather() is provided by tidyr

plot_data <- test_data %>%
  gather(key = "gene", value = "value", -c(id, species)) %>%
  mutate(id = factor(id, levels = 1:10),
         gene = factor(gene, levels = unique(gene)),
         value = factor(value, levels = c(1,0)))
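In current tidyr, `gather()` is superseded by `pivot_longer()`. A minimal sketch of the same reshaping step, assuming a small made-up data frame `wide` with the same layout as `test_data`, could look like this:

```r
library(tidyr)

# Toy wide data (assumption: same shape as test_data, with fewer columns)
wide <- data.frame(
  id = 1:4,
  species = c("Pig", "Broiler", "Wild bird", "Red Fox"),
  gyrA = c(1, 0, 1, 1),
  parC = c(0, 0, 1, 0)
)

# Reshape every column except id and species into gene/value pairs
long <- pivot_longer(wide, cols = -c(id, species),
                     names_to = "gene", values_to = "value")

# Same factor conversions as in the sample data above
long$id <- factor(long$id, levels = unique(long$id))
long$gene <- factor(long$gene, levels = unique(long$gene))
long$value <- factor(long$value, levels = c(1, 0))
```

Note that `pivot_longer()` returns the rows sample by sample, not gene by gene as `gather()` does, so code that depends on row position (such as `plot_data[1:10,]` above) would need adjusting.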