How to assign colours to subsets of variables in ggplot2

Posted 2019-08-13 00:39

I have a data frame with 379838 rows and 13 variables in columns (13 clinical samples):

    > str(df)
    'data.frame':   379838 obs. of  13 variables:
     $ V1 : num  0.8146 0.7433 0.0174 0.177 0 ...
     $ V2 : num  0.7465 0.5833 0.0848 0.5899 0.0161 ...
     $ V3 : num  0.788 0.843 0.333 0.801 0.156 ...
     $ V4 : num  0.601 0.958 0.319 0.807 0.429 ...
     $ V5 : num  0.792 0.49 0.341 0.865 1 ...
     $ V6 : num  0.676 0.801 0.229 0.822 0.282 ...
     $ V7 : num  0.783 0.732 0.223 0.653 0.507 ...
     $ V8 : num  0.69 0.773 0.108 0.69 0.16 ...
     $ V9 : num  0.4014 0.5959 0.0551 0.7578 0.2784 ...
     $ V10: num  0.703 0.784 0.131 0.698 0.204 ...
     $ V11: num  0.6731 0.8224 0.125 0.6021 0.0772 ...
     $ V12: num  0.7889 0.7907 0.0881 0.7175 0.2392 ...
     $ V13: num  0.6731 0.8221 0.0341 0.4059 0 ...

I am trying to make a ggplot2 box plot that splits the variables into three groups (V1-V5, V6-V9 and V10-V13) and assigns a different colour to the variables of each group.

I am trying the following code:

    df1 = as.vector(df[, c("V1", "V2", "V3", "V4", "V5")])
    df2 = as.vector(df[, c("V6", "V7", "V8", "V9")])
    df3 = as.vector(df[, c("V10", "V11", "V12", "V13")])
    sample = c(df1, df2, df3)

    library(reshape2)

    meltData1 <- melt(df, varnames="sample")

    str(meltData1)
    'data.frame':  4937894 obs. of  2 variables:
     $ variable: Factor w/ 13 levels "V1","V2","V3",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ value   : num  0.8146 0.7433 0.0174 0.177 0 ...

    p = ggplot(data=meltData1, aes(variable, value, fill=x$sample))
    p + geom_boxplot()

That gives me white box plots. How can I assign a colour to three groups of variables? Many thanks in advance!

Tags: r ggplot2
2 answers
贪生不怕死
#2 · 2019-08-13 01:03

This is a follow-up to Didzis Elferts's answer.

Objective: Split the sample into 3 colour groups with a difference in shade within the colour group.

The first part of the code is the same:

    df <- as.data.frame(matrix(rnorm(1300), ncol = 13))
    library(reshape2)
    dflong <- melt(df)
    dflong$sample <- c(rep("group1", nrow(df) * 5), rep("group2", nrow(df) * 4), rep("group3", nrow(df) * 4))
    library(ggplot2)

Now use the RColorBrewer package to select color shades:

    library(RColorBrewer)

Create a vector of colors from each palette:

    col.g <- brewer.pal(9, "Greens")[5:9] # select 5 colors from the Greens palette
    col.r <- brewer.pal(9, "Reds")[6:9]   # select 4 colors from the Reds palette
    col.b <- brewer.pal(9, "Blues")[6:9]  # select 4 colors from the Blues palette
    my.cols <- c(col.g, col.r, col.b)

Take a look at the colors selected:

    image(1:13, 1, as.matrix(1:13), col = my.cols, xlab = "my palette", ylab = "", xaxt = "n", yaxt = "n", bty = "n")

And now plot with the colors we have created:

    ggplot(data = dflong, aes(variable, value, colour = variable)) + geom_boxplot() + scale_colour_manual(values = my.cols)

In the plot above, colour and scale_colour_manual() color only the outlines of the boxes. To fill the boxes instead, use fill and scale_fill_manual():

    ggplot(data = dflong, aes(variable, value, fill = variable)) + geom_boxplot() + scale_fill_manual(values = my.cols)
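An unnamed `values` vector is matched to the factor levels by position, so the shades only line up with the groups while the levels stay in the order V1 to V13. A minimal sketch of a variant that names each color after its variable, so the mapping no longer depends on level order. It assumes the columns are called V1 to V13, as in the simulated data, and that naming the vector is an addition here rather than part of the approach above:

```r
library(RColorBrewer)
library(reshape2)
library(ggplot2)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
df <- as.data.frame(matrix(rnorm(1300), ncol = 13))
dflong <- melt(df, measure.vars = names(df))

# 5 greens, 4 reds and 4 blues, as in the palette built above
my.cols <- c(brewer.pal(9, "Greens")[5:9],
             brewer.pal(9, "Reds")[6:9],
             brewer.pal(9, "Blues")[6:9])

# Name each color after the variable it should fill
names(my.cols) <- paste0("V", 1:13)

p <- ggplot(dflong, aes(variable, value, fill = variable)) +
  geom_boxplot() +
  scale_fill_manual(values = my.cols)
p
```

With a named vector, scale_fill_manual() matches colors to levels by name, so reordering the factor levels would move the boxes without changing which color each variable gets.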

Here's an example of what I'm looking for

P.S. I'm a total newbie and learning R myself. I saw this question as an opportunity to apply something I just learned.

疯言疯语
#3 · 2019-08-13 01:05

As sample data were not provided, I made a new data frame with 13 columns named V1 to V13.

    df <- as.data.frame(matrix(rnorm(1300), ncol = 13))

The function melt() from the reshape2 package transforms the data from wide to long format. The data frame now has two columns: variable and value.

    library(reshape2)
    dflong <- melt(df)

A new column, sample, is then added to the long-format data. Because melt() stacks the original columns one after another, each group name is repeated once for every row of the original data frame times the number of original columns in that group.

    dflong$sample <- c(rep("group1", nrow(df) * 5), rep("group2", nrow(df) * 4), rep("group3", nrow(df) * 4))
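The `rep()` call relies on the melted rows staying in the original column order. A minimal sketch of an alternative that looks the group up from the variable name instead, so it still works if the rows are reordered. The lookup vector `group.of` is a name introduced here for illustration, and the column names V1 to V13 are assumed from the simulated data:

```r
library(reshape2)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
df <- as.data.frame(matrix(rnorm(1300), ncol = 13))
dflong <- melt(df, measure.vars = names(df))

# Hypothetical lookup table: variable name -> group label
group.of <- c(rep("group1", 5), rep("group2", 4), rep("group3", 4))
names(group.of) <- paste0("V", 1:13)

# Index the lookup by each row's variable name
dflong$sample <- unname(group.of[as.character(dflong$variable)])

table(dflong$sample)
```

The counts per group should come out as 5, 4 and 4 times the number of rows in `df`, the same as with the `rep()` approach.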

The new column is then passed to the fill= argument so that the boxes are colored by group.

    library(ggplot2)
    ggplot(data = dflong, aes(variable, value, fill = sample)) + geom_boxplot()

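By default ggplot2 picks the three fill colors itself. If specific colors are wanted for each group, one option is to add scale_fill_manual() with a named vector. This is a sketch only; the vector name `group.cols`, the particular color names and the legend title are illustrative choices, not something taken from the question:

```r
library(reshape2)
library(ggplot2)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
df <- as.data.frame(matrix(rnorm(1300), ncol = 13))
dflong <- melt(df, measure.vars = names(df))

# Same grouping as above: 5, 4 and 4 columns per group
dflong$sample <- rep(c("group1", "group2", "group3"),
                     times = nrow(df) * c(5, 4, 4))

# Illustrative colors, one per group, matched by name
group.cols <- c(group1 = "forestgreen",
                group2 = "firebrick",
                group3 = "steelblue")

p <- ggplot(dflong, aes(variable, value, fill = sample)) +
  geom_boxplot() +
  scale_fill_manual(values = group.cols, name = "Group")
p
```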
