I'm attempting to plot an overview of interesting segments in different protein sequences, faceted by organism.
Each facet (organism) may contain a different number of proteins, drawn as long black segments. Each protein has a colored overlay of shorter segments of equal length, which may overlap; the color indicates the patient group.
I first encountered problems with differences in the spacing between the proteins/segments on the y-axes in the different facets. I managed to solve this with ggplot2's coord_fixed function using a specified ratio. However, the heights of the y-axes in each facet still do not fit the number of segments. In addition, coord_fixed throws an error when I try facet_wrap(scales = "free_y"), as it does not allow for free axes.
How can I remove the extra spacing on the y-axes / control the height of the y-axes within each facet?
Here is some sample code:
library(ggplot2)
library(dplyr)

d_list <- lapply(paste("protein", seq(1,100,1)), function(protein){
  # The full length of the protein
  prot_length <- sample(seq(100,500,1), size = 1)
  # The organism the protein belongs to
  org_name <- sample(paste("organism", seq(1,5,1), sep = "_"), 1)
  # The start and end of the segments of interest - 15 amino acids long
  start <- sample(seq(1,prot_length-14,1),sample(1:20,1))
  end <- start + 14
  # The patient/group the segments of interest originate from
  group <- sample(paste("patient", seq(1,3,1), sep = "_"), length(start), T)
  data.frame(protein_name = rep(protein,length(start)),
             protein_length = rep(prot_length, length(start)),
             start = start,
             end = end,
             organism_name = rep(org_name,length(start)),
             group = group)
})

d <- do.call("rbind", sample(d_list, 20))

d %>%
  arrange(., organism_name, desc(protein_length)) %>%
  mutate(., protein_name = factor(protein_name, levels = unique(protein_name))) %>%
  ggplot(., aes(x = 1, xend = protein_length, y = protein_name, yend = protein_name)) +
  geom_segment(color = rgb(0,0,0), size = 1) +
  geom_segment(aes(x = start, xend = end, y = protein_name, yend = protein_name, color = as.factor(group)),
               size = 0.7) +
  scale_x_continuous(breaks = seq(0,500,100), labels = seq(0,500,100)) +
  scale_y_discrete(label = NULL, drop = T) +
  scale_color_manual(values = c("firebrick1", "dodgerblue1", "darkgoldenrod1")) +
  facet_wrap(~organism_name, ncol = 1, drop = T) +
  theme_minimal() +
  labs(color = "Group", y = "Proteins", x = "Amino Acid Position") +
  theme(axis.title.x = element_text(size = 15, face = "bold", vjust = 0.5),
        axis.text.x = element_text(size = 12),
        panel.grid.minor.x = element_blank(),
        axis.title.y = element_text(size = 15, face = "bold", vjust = 0.5),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        legend.title = element_text(size = 15, face = "bold"),
        legend.text = element_text(size = 12)) +
  coord_fixed(ratio = 2)
Edit to combine facet_wrap's strip positions with facet_grid's free panel sizes
(Note: I increased the segment sizes because they were really hard to see...)
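One possible way to get strips on top of each panel together with free panel heights is facet_col from the ggforce package. This is a minimal sketch and not necessarily the approach the edit used. It assumes ggforce is installed and ggplot2 is at least 3.4.0 (for linewidth; use size on older versions). The proteins and segments data frames are hypothetical toy data standing in for d from the question.

```r
library(ggplot2)
library(ggforce)  # assumption: ggforce is installed; it provides facet_col()

# Hypothetical toy data: one row per protein (full-length black bar)
proteins <- data.frame(
  protein_name   = c("protein 1", "protein 2", "protein 3", "protein 4"),
  protein_length = c(450, 300, 200, 380),
  organism_name  = c("organism_1", "organism_1", "organism_1", "organism_2")
)

# Hypothetical toy data: one row per 15-residue segment of interest
segments <- data.frame(
  protein_name  = c("protein 1", "protein 1", "protein 2", "protein 4"),
  organism_name = c("organism_1", "organism_1", "organism_1", "organism_2"),
  start         = c(10, 120, 50, 200),
  group         = c("patient_1", "patient_2", "patient_3", "patient_1")
)
segments$end <- segments$start + 14

p_col <- ggplot(proteins, aes(y = protein_name, yend = protein_name)) +
  geom_segment(aes(x = 1, xend = protein_length), color = "black", linewidth = 2) +
  geom_segment(data = segments,
               aes(x = start, xend = end, color = group), linewidth = 1.5) +
  # Single column of panels, strips on top, panel height scaled to the
  # number of proteins in each organism. No coord_fixed() is used here.
  facet_col(vars(organism_name), scales = "free_y", space = "free") +
  labs(color = "Group", y = "Proteins", x = "Amino Acid Position") +
  theme_minimal()

p_col
```

In this sketch, organism_1 has three proteins and organism_2 has one, so the first panel should come out roughly three times as tall as the second.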
Original answer
Sounds like you might be looking for facet_grid instead of facet_wrap. It allows axis breaks and facet heights to vary if you set both scales & space to "free_y".
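The facet_grid suggestion can be sketched as follows. This is a minimal example under stated assumptions, not the exact code from the answer: the proteins and segments data frames are hypothetical toy data standing in for d, and linewidth assumes ggplot2 3.4.0 or later (use size on older versions). coord_fixed is dropped, since it cannot be combined with free scales.

```r
library(ggplot2)

# Hypothetical toy data: one row per protein (full-length black bar)
proteins <- data.frame(
  protein_name   = c("protein 1", "protein 2", "protein 3", "protein 4"),
  protein_length = c(450, 300, 200, 380),
  organism_name  = c("organism_1", "organism_1", "organism_1", "organism_2")
)

# Hypothetical toy data: one row per 15-residue segment of interest
segments <- data.frame(
  protein_name  = c("protein 1", "protein 1", "protein 2", "protein 4"),
  organism_name = c("organism_1", "organism_1", "organism_1", "organism_2"),
  start         = c(10, 120, 50, 200),
  group         = c("patient_1", "patient_2", "patient_3", "patient_1")
)
segments$end <- segments$start + 14

p <- ggplot(proteins, aes(y = protein_name, yend = protein_name)) +
  geom_segment(aes(x = 1, xend = protein_length), color = "black", linewidth = 2) +
  geom_segment(data = segments,
               aes(x = start, xend = end, color = group), linewidth = 1.5) +
  # scales = "free_y" drops unused proteins from each panel;
  # space = "free_y" makes each panel's height proportional to its protein count
  facet_grid(organism_name ~ ., scales = "free_y", space = "free_y") +
  labs(color = "Group", y = "Proteins", x = "Amino Acid Position") +
  theme_minimal()

p
```

With facet_grid the strips sit on the right-hand side of each panel rather than on top, which is the main visual difference from facet_wrap.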