Using geom_line with multiple groupings

Posted 2020-05-22 01:26

Question:

I have a table as follows:

> testsizes
    size value replicate lane
361   16  6898         1   L1
362   17 10707         1   L1
363   18  1786         1   L1
364   19  1721         1   L1
365   20  2454         1   L1
421   16  8486         2   L1
422   17 26691         2   L1
423   18  3241         2   L1
424   19  5068         2   L1
425   20  7579         2   L1
481   16  4152         3   L1
482   17  4452         3   L1
483   18   899         3   L1
484   19  1973         3   L1
485   20  2595         3   L1
571   16  8284         1   L2
572   17  9045         1   L2
573   18  5041         1   L2
574   19  7160         1   L2
575   20  9730         1   L2
631   16  5639         2   L2
632   17  9773         2   L2
633   18  2433         2   L2
634   19  3017         2   L2
635   20  3864         2   L2
691   16 10161         3   L2
692   17 18609         3   L2
693   18  3760         3   L2
694   19  3543         3   L2
695   20  4257         3   L2

> dput(testsizes)
structure(list(size = c(16L, 17L, 18L, 19L, 20L, 16L, 17L, 18L, 
19L, 20L, 16L, 17L, 18L, 19L, 20L, 16L, 17L, 18L, 19L, 20L, 16L, 
17L, 18L, 19L, 20L, 16L, 17L, 18L, 19L, 20L), value = c(6898L, 
10707L, 1786L, 1721L, 2454L, 8486L, 26691L, 3241L, 5068L, 7579L, 
4152L, 4452L, 899L, 1973L, 2595L, 8284L, 9045L, 5041L, 7160L, 
9730L, 5639L, 9773L, 2433L, 3017L, 3864L, 10161L, 18609L, 3760L, 
3543L, 4257L), replicate = c("1", "1", "1", "1", "1", "2", "2", 
"2", "2", "2", "3", "3", "3", "3", "3", "1", "1", "1", "1", "1", 
"2", "2", "2", "2", "2", "3", "3", "3", "3", "3"), lane = c("L1", 
"L1", "L1", "L1", "L1", "L1", "L1", "L1", "L1", "L1", "L1", "L1", 
"L1", "L1", "L1", "L2", "L2", "L2", "L2", "L2", "L2", "L2", "L2", 
"L2", "L2", "L2", "L2", "L2", "L2", "L2")), .Names = c("size", 
"value", "replicate", "lane"), row.names = c(361L, 362L, 363L, 
364L, 365L, 421L, 422L, 423L, 424L, 425L, 481L, 482L, 483L, 484L, 
485L, 571L, 572L, 573L, 574L, 575L, 631L, 632L, 633L, 634L, 635L, 
691L, 692L, 693L, 694L, 695L), class = "data.frame")

I want to make a line plot using ggplot that shows the change in value across sizes. At the moment I have this, amongst the other combinations I have tried:

ggplot(testsizes, aes(size, value, group=replicate, colour=replicate)) +
    geom_line()

It looks like it's trying to incorporate both lanes into the same series, but I can't find a way to set the lanes as another factor to group on. I want the lines to be grouped by both the replicate and lane categories. The lines should be coloured by lane, but the replicates do not need to be distinguished from each other.

I am aware that I can probably achieve this by concatenating the two groups into one group beforehand. However, before I go down that route, I am wondering if ggplot can group by more than one grouping in a line plot without facets (I need to use facets later for another grouping)? I feel like it should be able to.

Answer 1:

Following up on my comment about using the interaction of the two variables:

ggplot(testsizes, aes(x = size, y = value,
                      group = interaction(replicate, lane),
                      colour = lane)) +
 geom_line()

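Which gives one line per replicate/lane combination (six lines in total), coloured by lane. (The original plot image is not preserved here.)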

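To make the grouping in the answer above more concrete, here is a minimal base R sketch of what interaction() returns. The two short vectors are hypothetical stand-ins for the replicate and lane columns, not the full table from the question.

```r
# Hypothetical miniature of the replicate/lane columns from the question.
replicate <- rep(c("1", "2", "3"), times = 2)
lane      <- rep(c("L1", "L2"), each = 3)

# interaction() returns a factor with one level per combination of its inputs.
combined <- interaction(replicate, lane)

# Inspect the combined labels and the number of distinct groups.
print(as.character(combined))
print(nlevels(combined))
```

Passing this factor to the group aesthetic is what lets geom_line() draw a separate line for each replicate within each lane, while colour can still be mapped to lane alone.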
Answer 2:

As @joran pointed out, if ggplot could do this itself it would simply be concatenating the two groups anyway. So concatenating the two groups myself is the right way to go and gives me the desired result:

> testsizes$replane <- paste(testsizes$replicate, testsizes$lane, sep="_")

> testsizes
    size value replicate lane replane
361   16  6898         1   L1    1_L1
362   17 10707         1   L1    1_L1
363   18  1786         1   L1    1_L1
364   19  1721         1   L1    1_L1
365   20  2454         1   L1    1_L1
421   16  8486         2   L1    2_L1
422   17 26691         2   L1    2_L1
423   18  3241         2   L1    2_L1
424   19  5068         2   L1    2_L1
425   20  7579         2   L1    2_L1
481   16  4152         3   L1    3_L1
482   17  4452         3   L1    3_L1
483   18   899         3   L1    3_L1
484   19  1973         3   L1    3_L1
485   20  2595         3   L1    3_L1
571   16  8284         1   L2    1_L2
572   17  9045         1   L2    1_L2
573   18  5041         1   L2    1_L2
574   19  7160         1   L2    1_L2
575   20  9730         1   L2    1_L2
631   16  5639         2   L2    2_L2
632   17  9773         2   L2    2_L2
633   18  2433         2   L2    2_L2
634   19  3017         2   L2    2_L2
635   20  3864         2   L2    2_L2
691   16 10161         3   L2    3_L2
692   17 18609         3   L2    3_L2
693   18  3760         3   L2    3_L2
694   19  3543         3   L2    3_L2
695   20  4257         3   L2    3_L2

> ggplot(testsizes, aes(size, value, group=replane, colour=lane)) +
    geom_line()

I guess the moral here is to do as much preprocessing of your table as you can before giving it to ggplot.
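As a rough check that the pasted key and interaction() describe the same grouping, the base R sketch below builds both on a cut-down stand-in for the table (only sizes 16 and 17, values copied from the question). The names key_paste, key_inter and same_partition are made up for this illustration.

```r
# Cut-down stand-in for the question's table: sizes 16 and 17 only.
testsizes <- data.frame(
  size      = rep(c(16L, 17L), times = 6),
  value     = c(6898L, 10707L, 8486L, 26691L, 4152L, 4452L,
                8284L, 9045L, 5639L, 9773L, 10161L, 18609L),
  replicate = rep(rep(c("1", "2", "3"), each = 2), times = 2),
  lane      = rep(c("L1", "L2"), each = 6),
  stringsAsFactors = FALSE
)

# Key built by hand with paste(), as in this answer.
key_paste <- paste(testsizes$replicate, testsizes$lane, sep = "_")

# Key built with interaction(), as in the other answer.
key_inter <- interaction(testsizes$replicate, testsizes$lane)

# Cross-tabulate the two keys: if they split the rows identically,
# every row and every column of the table has exactly one non-zero cell.
tab <- table(key_paste, key_inter)
same_partition <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
print(same_partition)
```

Only the labels differ between the two keys, so either one can be given to the group aesthetic; the pasted column has the side benefit of being visible in the table itself.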