Hi, I am relatively new to R / ggplot2
and I would like to ask for some advice on how to create a plot that looks like this:
Explanation: A diverging bar plot showing biological functions, with genes that have increased expression (yellow) pointing towards the right and genes with reduced expression (purple) pointing towards the left. The length of each bar represents the number of differentially expressed genes, and the color intensity varies according to the p-value.
Note that the x-axis must be 'positive' in both directions. In the published literature on gene expression experiments, bars that point towards the left represent genes with reduced expression, and bars that point towards the right represent genes with increased expression. The purpose of the graph is not to show the "magnitude" of change (which would give rise to positive and negative values). Instead, we are trying to plot the NUMBER of genes whose expression has changed, which therefore cannot be negative.
I have tried ggplot2
but failed completely to reproduce the graph that is shown.
Here is the data which I am trying to plot:
> dput(sample)
structure(list(Name = structure(c(15L, 19L, 5L, 11L, 8L, 6L,
16L, 13L, 17L, 1L, 3L, 2L, 14L, 18L, 7L, 12L, 10L, 9L, 4L, 20L
), .Label = c("Actin synthesis", "Adaptive immunity", "Antigen presentation",
"Autophagy", "Cell cycle", "Cell division", "Cell polarity",
"DNA repair", "Eye development", "Lipid metabolism", "Phosphorylation",
"Protein metabolism", "Protein translation", "Proteolysis", "Replication",
"Signaling", "Sumoylation", "Trafficking", "Transcription", "Translational initiation"
), class = "factor"), Trend_in_AE = structure(c(2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = c("Down", "Up"), class = "factor"), Count = c(171L,
201L, 38L, 63L, 63L, 47L, 22L, 33L, 20L, 16L, 16L, 7L, 10L, 4L,
13L, 15L, 5L, 7L, 9L, 7L), PValue = c(1.38e-08, 1.22e-06, 1.79e-06,
2.89e-06, 0.000122, 0.000123, 0.00036, 0.000682, 0.001030253,
0.001623939, 7.76e-05, 0.000149, 0.000734, 0.001307039, 0.00292414,
0.003347556, 0.00360096, 0.004006781, 0.007330264, 0.010083734
)), .Names = c("Name", "Trend_in_AE", "Count", "PValue"), class = "data.frame", row.names = c(NA,
-20L))
Thank you very much for your help and suggestions; this really helps with my learning.
My own humble attempt was this:
table <- read.delim("file.txt", header = T, sep = "\t")
library(ggplot2)
ggplot(aes(x=Number, y=Names)) +
geom_bar(stat="identity",position="identity") +
xlab("number of genes") +
ylab("Name"))
The result was an error message regarding the aes.
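For reference, here is a minimal sketch of one way the plot described above might be approached. It is not a definitive solution. It rebuilds the data from the dput() output under the hypothetical name genes (to avoid masking base::sample) and adds a hypothetical helper column SignedCount that is negative for "Down" rows, so those bars point left. The axis labels are then passed through abs() so that both directions read as positive counts. The colors "gold" and "purple" and the use of -log10(PValue) for transparency are assumptions about how the target figure encodes color intensity.

library(ggplot2)

# Data rebuilt from the dput() output in the question (hypothetical name: genes)
genes <- data.frame(
  Name = c("Replication", "Transcription", "Cell cycle", "Phosphorylation",
           "DNA repair", "Cell division", "Signaling", "Protein translation",
           "Sumoylation", "Actin synthesis", "Antigen presentation",
           "Adaptive immunity", "Proteolysis", "Trafficking", "Cell polarity",
           "Protein metabolism", "Lipid metabolism", "Eye development",
           "Autophagy", "Translational initiation"),
  Trend_in_AE = rep(c("Up", "Down"), each = 10),
  Count = c(171, 201, 38, 63, 63, 47, 22, 33, 20, 16,
            16, 7, 10, 4, 13, 15, 5, 7, 9, 7),
  PValue = c(1.38e-08, 1.22e-06, 1.79e-06, 2.89e-06, 0.000122, 0.000123,
             0.00036, 0.000682, 0.001030253, 0.001623939, 7.76e-05, 0.000149,
             0.000734, 0.001307039, 0.00292414, 0.003347556, 0.00360096,
             0.004006781, 0.007330264, 0.010083734),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: negative counts for "Down" so those bars point left
genes$SignedCount <- ifelse(genes$Trend_in_AE == "Up", genes$Count, -genes$Count)

# Order the categories by signed count so the bars are sorted along the axis
genes$Name <- reorder(factor(genes$Name), genes$SignedCount)

p <- ggplot(genes, aes(x = Name, y = SignedCount,
                       fill = Trend_in_AE, alpha = -log10(PValue))) +
  geom_col() +                               # same as geom_bar(stat = "identity")
  coord_flip() +                             # horizontal bars
  scale_y_continuous(labels = abs) +         # show positive numbers on both sides
  scale_fill_manual(values = c(Up = "gold", Down = "purple")) +
  labs(x = "Name", y = "Number of genes", alpha = "-log10(p-value)")

print(p)

The signed column only controls the direction of the bars; the axis labels and the underlying Count values stay non-negative, which matches the requirement that the x-axis be 'positive' in both directions.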