Scatter plot with groups

Posted 2019-08-18 19:19

Question:

Hi, I want to make a scatter plot (point plot) of different groups. I can get it to work for individual plots, but ggplot2 needs coordinates for the x-axis, and that is where I already get into trouble. This is my data structure:

# A tibble: 2 x 33
  gene_id     gene     N1    N2    N3    N4    N5    N6    N7    T1    T2    T3    T4
  <chr>       <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ENSMUSG000… RNS     198   182   206   183   177   194   193   173   191   167   200
2 ENSMUSG000… RNS2    199   198   216   252   273   159   164   159   162   151   199
# ... with 20 more variables: T5 <dbl>, T6 <dbl>, T7 <dbl>, T8 <dbl>, T9 <dbl>,
#   T10 <dbl>, T11 <dbl>, T12 <dbl>, T13 <dbl>, T14 <dbl>, M1 <dbl>, M2 <dbl>,
#   M3 <dbl>, M4 <dbl>, M5 <dbl>, M6 <dbl>, M7 <dbl>, M8 <dbl>, M9 <dbl>, M10 <dbl>

I would like to first plot all individuals next to each other, and also group the N, T and M samples together and plot them in different colors. I am pretty sure I can assign the colors once the plotting itself works.

ggplot(a, aes(y=a[,3],x=1))+
geom_point()

As you can see, I am still at a very early stage in the process.

Thanks for your help!

Answer 1:

You need to turn the column names into a variable, i.e. reshape the data from wide to long format. I used the data that you posted, but I am not sure whether this is the plot you expect.

data <- read.csv('tabla.csv')
data

Initial data

      gene_id gene  N1  N2  N3  N4  N5  N6  N7  T1  T2  T3  T4
1 ENSMUSG000…  RNS 198 182 206 183 177 194 193 173 191 167 200
2 ENSMUSG000… RNS2 199 198 216 252 273 159 164 159 162 151 199

Melt the data

library(reshape)

newData <- melt(data, id=c("gene_id", "gene"))
newData

You get a data frame like this

       gene_id gene variable value
1  ENSMUSG000…  RNS       N1   198
2  ENSMUSG000… RNS2       N1   199
3  ENSMUSG000…  RNS       N2   182
4  ENSMUSG000… RNS2       N2   198
5  ENSMUSG000…  RNS       N3   206
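
If you would rather not load an extra package for this step, the same wide-to-long reshaping can be sketched in base R. This is only an illustration, not the method used above: the names `wide`, `sample_cols` and `long` are made up here, and only three of the sample columns from the question are included.

```r
# Small stand-in for the wide table (values taken from the question;
# the gene IDs are left truncated exactly as they were printed there)
wide <- data.frame(
  gene_id = c("ENSMUSG000…", "ENSMUSG000…"),
  gene    = c("RNS", "RNS2"),
  N1      = c(198, 199),
  N2      = c(182, 198),
  T1      = c(173, 159),
  stringsAsFactors = FALSE
)

# Every column that is not an identifier is treated as a sample column
sample_cols <- setdiff(names(wide), c("gene_id", "gene"))

# Stack the sample columns: one row per gene/sample combination
long <- data.frame(
  gene_id  = rep(wide$gene_id, times = length(sample_cols)),
  gene     = rep(wide$gene, times = length(sample_cols)),
  variable = rep(sample_cols, each = nrow(wide)),
  value    = unlist(wide[sample_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
long
```

The rows come out in the same order as in the melted table above: all genes for N1 first, then all genes for N2, and so on.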

Create a new column with the name of each group (N, T or M)

vectorGroup <- character(0)

for(value in newData$variable){

        # Get the group of each value from the first letter of its column name
        if(startsWith(value, 'N')){
                group <- 'N'
        }else if(startsWith(value, 'T')){
                group <- 'T'
        }else if(startsWith(value, 'M')){
                group <- 'M'
        }

        # Store the group name in a vector
        vectorGroup <- c(vectorGroup, group)
}

newData$group <- vectorGroup
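
Since every sample name starts with its group letter, the same column can probably be built without a loop. A minimal sketch, assuming all sample columns begin with N, T or M; the small `newData` below is only a stand-in containing a few of the sample names from the question.

```r
# Stand-in for the melted data (only the column needed here)
newData <- data.frame(
  variable = factor(c("N1", "N2", "T1", "T10", "M3"))
)

# The group is the first character of the sample name
newData$group <- substr(as.character(newData$variable), 1, 1)

# Optional: fix the order in which the groups appear in the legend
newData$group <- factor(newData$group, levels = c("N", "T", "M"))

newData
```

Unlike the loop, this does not grow a vector one element at a time, which matters once the table has many genes.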

Finally making the scatter plot

library(ggplot2)

ggplot(newData, aes(x = gene, y = value, color = group)) +
        theme_minimal() +
        geom_point() +
        scale_color_brewer(palette = 'Paired')

You get a plot like this
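
The question also asks for all individuals next to each other. One way to sketch that, assuming a long data frame with `gene`, `variable`, `value` and `group` columns like the one built above, is to put the sample name on the x-axis and give each gene its own panel. The data frame here is a small stand-in using a few values from the question, and faceting by gene is only my guess at what is wanted.

```r
library(ggplot2)

# Stand-in for the long data (a few values from the question)
newData <- data.frame(
  gene     = rep(c("RNS", "RNS2"), times = 3),
  variable = rep(c("N1", "N2", "T1"), each = 2),
  value    = c(198, 199, 182, 198, 173, 159),
  stringsAsFactors = FALSE
)
newData$group <- substr(newData$variable, 1, 1)

# Keep the samples in their original column order instead of alphabetical
# order (otherwise M sorts before N, and T10 before T2)
newData$variable <- factor(newData$variable, levels = unique(newData$variable))

p <- ggplot(newData, aes(x = variable, y = value, color = group)) +
        geom_point() +
        facet_wrap(~ gene) +
        theme_minimal() +
        labs(x = "sample", y = "value")

# print(p)  # uncomment to draw the plot
```

With all 31 samples on the x-axis the labels may overlap; rotating them with `theme(axis.text.x = element_text(angle = 90))` is one option.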