Do I need to reshape this wide data to effectively use ggplot2?

Posted 2020-02-12 03:50

Question:

I have a data.frame that looks like

  Year Crustaceans       Cod       Tuna    Herring Scorpion.fishes
1 1950    58578630   2716706   69690537   87161396        15250015
2 1951    59194582   3861166   34829755   51215349        15454659
3 1952    47562941   4396174   31061481   13962479        12541484
4 1953    68432658   3901176   23225423   13229061         9524564
5 1954    64395489   4412721   20798126   25285539         9890656
6 1955    76111004   4774045   13992697   18910756         8446391

There are several more species (columns), and the years run from 1950 to 2006. I'd like to explore the data with ggplot2 (which I'm just learning). Do I need to transform this data so that species is a factor in order to use ggplot2 effectively? If not, how do I avoid having to create a layer for each species individually? If yes (or really in either case), a quick pointer on using reshape or plyr to turn column names into a factor would be much appreciated.

Answer 1:

A simple transformation using melt (from the reshape or reshape2 package) would suffice. I would do:

library(ggplot2)
library(reshape2)
# df is the wide data frame from the question; melt() stacks the species columns
qplot(Year, value, colour = variable, data = melt(df, 'Year'), geom = 'line')
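
To make the melt step more concrete, here is a minimal sketch. It assumes a small data frame named df built from the first three rows and three of the species columns in the question; the names long and p are just illustrative. It uses ggplot() with geom_line() in place of qplot(), since recent ggplot2 releases deprecate qplot().

```r
library(reshape2)
library(ggplot2)

# Assumed sample: first three rows and three species from the question's data
df <- data.frame(
  Year = c(1950, 1951, 1952),
  Crustaceans = c(58578630, 59194582, 47562941),
  Cod = c(2716706, 3861166, 4396174),
  Tuna = c(69690537, 34829755, 31061481)
)

# Wide -> long: one row per (Year, species) pair.
# melt() names the new columns "variable" and "value" by default;
# "variable" is a factor whose levels are the former column names.
long <- melt(df, id.vars = "Year")

# A single line layer covers every species through the colour aesthetic,
# so no per-species layer is needed.
p <- ggplot(long, aes(x = Year, y = value, colour = variable)) +
  geom_line()
print(p)
```

Running str(long) after the melt step should show Year, variable (a factor) and value, which is the shape ggplot2 works with most easily.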


Answer 2:

I found the following link extremely helpful for learning reshape. The reshape and plyr functions are very easy to use once you understand how they work, although they are not necessarily the fastest option (the data.table package is partly written in C, so it is much faster). This tutorial PDF is a great resource for learning them. I also suggest copying the lines from example(cast) into a script and running them one at a time to see the results.

http://had.co.nz/stat405/lectures/19-tables.pdf
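
As a rough illustration of the cast side of the workflow, the sketch below uses dcast() from reshape2 (the successor to cast() in the original reshape package) to turn melted data back into the wide layout. The data frame wide is an assumed sample taken from the first two rows of the question, and the names long and back are hypothetical.

```r
library(reshape2)

# Assumed sample data, taken from the first two rows of the question
wide <- data.frame(
  Year = c(1950, 1951),
  Cod = c(2716706, 3861166),
  Tuna = c(69690537, 34829755)
)

# Wide -> long: species column names become the "variable" factor
long <- melt(wide, id.vars = "Year")

# Long -> wide again: one row per Year, one column per level of "variable"
back <- dcast(long, Year ~ variable, value.var = "value")
```

Stepping through these lines one at a time and printing each intermediate object is a quick way to see what melting and casting each do.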



Tags: r ggplot2