I have a data frame of 379838 rows and 13 variables in columns (13 clinical samples):
> str( df)
'data.frame': 379838 obs. of 13 variables:
$ V1 : num 0.8146 0.7433 0.0174 0.177 0 ...
$ V2 : num 0.7465 0.5833 0.0848 0.5899 0.0161 ...
$ V3 : num 0.788 0.843 0.333 0.801 0.156 ...
$ V4 : num 0.601 0.958 0.319 0.807 0.429 ...
$ V5 : num 0.792 0.49 0.341 0.865 1 ...
$ V6 : num 0.676 0.801 0.229 0.822 0.282 ...
$ V7 : num 0.783 0.732 0.223 0.653 0.507 ...
$ V8 : num 0.69 0.773 0.108 0.69 0.16 ...
$ V9 : num 0.4014 0.5959 0.0551 0.7578 0.2784 ...
$ V10: num 0.703 0.784 0.131 0.698 0.204 ...
$ V11: num 0.6731 0.8224 0.125 0.6021 0.0772 ...
$ V12: num 0.7889 0.7907 0.0881 0.7175 0.2392 ...
$ V13: num 0.6731 0.8221 0.0341 0.4059 0 ...
I am trying to make a ggplot2 box plot that groups the variables into three groups (V1-V5, V6-V9 and V10-V13) and assigns a different color to the variables of each group.
This is the code I have so far:
df1= as.vector(df[, c("V1", "V2", "V3","V4", "V5")])
df2= as.vector(df[, c("V6","V7", "V8","V9")])
df3=as.vector(df[, c( "V10","V11", "V12","V13")])
sample= c(df1,df2,df3)
library(reshape2)
meltData1 <- melt(df, varnames="sample")
str(meltData1)
'data.frame': 4937894 obs. of 2 variables:
$ variable: Factor w/ 13 levels "V1","V2","V3",..: 1 1 1 1 1 1 1 1 1 1 ...
$ value : num 0.8146 0.7433 0.0174 0.177 0 ...
p=ggplot(data=meltData1,aes(variable,value, fill=x$sample))
p+geom_boxplot()
That gives me white box plots. How can I assign a colour to three groups of variables? Many thanks in advance!
As sample data were not provided, I made a new data frame containing 13 columns named V1 to V13.
df<-as.data.frame(matrix(rnorm(1300),ncol=13))
With the function melt() from the reshape2 package, the data are transformed from wide to long format. The data frame now has two columns: variable and value.
library(reshape2)
dflong<-melt(df)
A new column, sample, is added to the long-format data. Here I repeated the names group1, group2 and group3 according to the number of rows in the original data frame and the number of original columns in each group.
dflong$sample<-c(rep("group1",nrow(df)*5),rep("group2",nrow(df)*4),rep("group3",nrow(df)*4))
The new column is passed to the fill= argument to set colors according to the grouping.
library(ggplot2)
ggplot(data=dflong,aes(variable,value, fill=sample))+geom_boxplot()
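The rep() call above depends on melt() stacking the columns in their original order. As a minimal sketch of an alternative, assuming the same V1 to V13 column names, the group can be looked up by variable name instead, so the assignment no longer depends on row order. The lookup vector group_of is a name introduced here for illustration only.

```r
library(reshape2)
library(ggplot2)

# Simulated data standing in for the 13 clinical samples
set.seed(1)
df <- as.data.frame(matrix(rnorm(1300), ncol = 13))
dflong <- melt(df)

# Hypothetical lookup vector: column name -> group label
group_of <- c(rep("group1", 5), rep("group2", 4), rep("group3", 4))
names(group_of) <- paste0("V", 1:13)

# Assign the group by name rather than by row position
dflong$sample <- unname(group_of[as.character(dflong$variable)])

p <- ggplot(dflong, aes(variable, value, fill = sample)) + geom_boxplot()
print(p)
```

Because the lookup is by name, the same code should keep working if the rows of dflong are reordered or filtered before plotting.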
This is a follow-up to Didzis Elferts' answer.
Objective: Split the sample into 3 colour groups with a difference in shade within the colour group.
The first part of the code is the same:
df<-as.data.frame(matrix(rnorm(1300),ncol=13))
library(reshape2)
dflong<-melt(df)
dflong$sample<-c(rep("group1",nrow(df)*5),rep("group2",nrow(df)*4),rep("group3",nrow(df)*4))
library(ggplot2)
Now, use the package RColorBrewer to select color shades:
library(RColorBrewer)
Create a vector of colors for each color class:
col.g <- c(brewer.pal(9,"Greens"))[5:9] # select 5 colors from class Greens
col.r <- c(brewer.pal(9,"Reds"))[6:9] # select 4 colors from class Reds
col.b <- c(brewer.pal(9,"Blues"))[6:9] # select 4 colors from class Blues
my.cols <- c(col.g,col.r,col.b)
Take a look at the colors selected:
image(1:13,1,as.matrix(1:13), col=my.cols, xlab="my palette", ylab="", xaxt="n", yaxt="n", bty="n")
And now plot with the colors we have created:
ggplot(data=dflong,aes(variable,value,colour=variable))+geom_boxplot()+scale_colour_manual(values = my.cols)
In the plot above, colour and scale_colour_manual color only the box outlines. To color the box interiors, use fill and scale_fill_manual:
ggplot(data=dflong,aes(variable,value,fill=variable))+geom_boxplot()+scale_fill_manual(values = my.cols)
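The plots above match colors to variables by position, so they rely on my.cols being in the same order as the factor levels. A minimal sketch of a variant, assuming the same simulated data, names the palette by variable; scale_fill_manual then matches a named values vector to the data by name.

```r
library(reshape2)
library(ggplot2)
library(RColorBrewer)

# Simulated data, as in the code above
set.seed(42)
df <- as.data.frame(matrix(rnorm(1300), ncol = 13))
dflong <- melt(df)

# 5 greens, 4 reds, 4 blues: one shade per variable
my.cols <- c(brewer.pal(9, "Greens")[5:9],
             brewer.pal(9, "Reds")[6:9],
             brewer.pal(9, "Blues")[6:9])

# Name each color after the variable it belongs to
names(my.cols) <- levels(dflong$variable)

p <- ggplot(dflong, aes(variable, value, fill = variable)) +
  geom_boxplot() +
  scale_fill_manual(values = my.cols)
print(p)
```

With a named palette, each variable should keep its color even if the factor levels are later reordered.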
P.S. I'm a total newbie and learning R myself. I saw this question as an opportunity to apply something I just learned.