Plot multiple variables on y-axis using ggplot

Posted 2019-07-16 01:36

Question:

I'm a beginner in R and need some help with a problem.

I have a text file containing data like this:

    A           C           G         class             phylum            order
-0.000187   -0.219166   1.693306   Chordata          Monotremata       Mammalia
 0.015664   -0.264506   1.482692   Chordata          Batidoidimorpha   Chondrichthyes
-0.404323    0.219374   2.230190   Platyhelminthes   Cyclophyllidea    Cestoda

but of course it has many more rows. I want to plot this data so that all the classes are on the x-axis and each class has its A, C, and G values drawn as points (geom_point), with the points connected by lines whose colour depends on whether they belong to A, C, or G. I managed to do this with the plot and par functions, but now I want to do it with ggplot2.

Thanks in advance.

Answer 1:

The specifics of your question are a bit unclear, but the general approach to plotting multiple variables in one plot with ggplot2 is to melt() the data frame into long format first. I didn't follow how the points and lines are supposed to fit into your graph, but here's an approach that uses the colour aesthetic to plot the columns A, C, and G against class on the x-axis:

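library(ggplot2)
library(reshape2)

# Example data: three numeric columns plus a class label
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10), class = sample(letters[20:23], 10, TRUE))
# Reshape to long format: one row per (class, variable, value)
df.m <- melt(df, id.vars = "class")
# Plot value against class, one colour per original column
ggplot(df.m, aes(class, value, colour = variable)) +
  geom_point()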
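
If the goal is one line per variable running across the classes (one possible reading of the question), a minimal sketch is to average the values within each class and then set group = variable. The averaging step, the fixed seed, and the name df.mean are assumptions made for illustration; they are not part of the original answer.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10),
                 class = sample(letters[20:23], 10, TRUE))
df.m <- melt(df, id.vars = "class")

# Assumption: collapse repeated classes to one mean per (class, variable),
# so each line has exactly one point per class.
df.mean <- aggregate(value ~ class + variable, data = df.m, FUN = mean)

# group = variable tells geom_line() to connect points across classes,
# one line (and one colour) per original column.
p <- ggplot(df.mean, aes(class, value, colour = variable, group = variable)) +
  geom_point() +
  geom_line()
print(p)
```

If the real data has exactly one row per class, the aggregate() step can be dropped and df.m plotted directly.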


Answer 2:

I ran into a similar plotting problem.

The answer is that you need a new column that can be used in the group interaction. Here I created a column called V1, which records which letter (a, c, or g) each row belongs to, and then used aes(group = interaction(variable.factor, new.factor)). In this example the column name 'V1' is arbitrary.

  class variable       value V1
1      u        a  0.77041380  a
2      v        a  0.09461429  a
3      t        a  0.22704242  a
4      w        a -0.21501380  a
5      w        a -0.48246983  a
6      v        a  1.69609897  a
7      w        a -0.38847860  a
8      t        a  2.45669883  a
9      t        a  0.24774451  a
10     u        a  0.04195110  a
11     u        c  0.57444553  c
12     v        c  0.73172047  c
13     t        c -1.59409421  c
14     w        c -0.12679464  c
15     w        c  0.19424856  c
16     v        c -1.28742724  c
17     w        c -1.12103626  c
18     t        c -0.57090558  c
19     t        c  0.53798077  c
20     u        c -0.47777022  c
21     u        g -0.91249913  g
22     v        g -1.49256508  g
23     t        g -1.77449710  g
24     w        g  0.71426647  g
25     w        g  0.79678361  g
26     v        g -1.28814106  g
27     w        g -1.04701972  g
28     t        g  0.07309817  g
29     t        g  2.03606615  g
30     u        g  1.76030312  g
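
The answer does not show how V1 was created. Since V1 matches variable in every row of the table above, one way to get the same layout is to copy that column. This is a sketch under that assumption, rebuilding df.m as in the first answer (the seed is also an assumption, so the values will differ from the table):

```r
library(reshape2)

set.seed(1)  # assumption: fixed seed for reproducibility
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10),
                 class = sample(letters[20:23], 10, TRUE))
df.m <- melt(df, id.vars = "class")

# Assumption: V1 is simply a copy of `variable`, as the printed table suggests.
df.m$V1 <- df.m$variable
head(df.m)

# interaction() builds one factor level per (V1, variable) combination;
# drop = TRUE discards combinations that never occur in the data.
grp <- interaction(df.m$V1, df.m$variable, drop = TRUE)
nlevels(grp)  # 3: one group each for a, c, and g
```

Because V1 duplicates variable here, the interaction yields the same three groups that group = variable would.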

Slightly modifying the ggplot code from the first answer:

ggplot(df.m, aes(class, value, colour = variable, group = interaction(V1, variable))) +
  geom_point()+
  geom_line()

This draws a line connecting the points of each letter across the classes (using the data from the answer above).

[Figure: letters connected by line]

If you want to separate the lines, use position = position_dodge():

ggplot(df.m, aes(class, value, colour = variable, group = interaction(V1, variable))) +
  geom_point(position = position_dodge(width = 0.2))+
  geom_line(position = position_dodge(width = 0.2))

[Figure: letters connected by line with jitter]

The takeaway here is that you need a factor for group = interaction() that is separate from, and spans across, your x-axis.
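
To see why the grouping factor matters, the following sketch counts the groups ggplot2 assigns with and without an explicit group aesthetic. With a discrete x-axis, the default grouping is built from all discrete variables in the plot, including class, so the lines stay within each class. The plot names and the use of layer_data() for inspection are choices made for this illustration.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # assumption: fixed seed for reproducibility
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10),
                 class = sample(letters[20:23], 10, TRUE))
df.m <- melt(df, id.vars = "class")

# No explicit group: ggplot2 groups by every discrete variable (class x variable).
p_default <- ggplot(df.m, aes(class, value, colour = variable)) +
  geom_line()

# Explicit group that does not involve the x variable: one line per letter.
p_grouped <- ggplot(df.m, aes(class, value, colour = variable, group = variable)) +
  geom_line()

# layer_data() returns the data a layer is drawn from, including the group id.
n_default <- length(unique(layer_data(p_default, 1)$group))
n_grouped <- length(unique(layer_data(p_grouped, 1)$group))
c(default = n_default, grouped = n_grouped)
```

The default plot has one group per class and letter combination, while the grouped plot has only three, which is what lets each line run across the x-axis.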



Tags: r ggplot2