Plot dataframe with ggplot2 - R

I am trying to create a graph using the following dataframe. It has 10 observations and 25 variables (one of them, ID, is just an ID column for the different observations):

'data.frame':    10 obs. of  25 variables:
 $ NDVI_mean   : num  0.0607 0.0552 0.5811 0.7676 0.0328 ...
 $ NDVI_sd     : num  0.0881 0.0298 0.1644 0.0937 0.0292 ...
 $ NDVI_mean.1 : num  0.0211 0.0549 0.1375 0.1207 0.024 ...
 $ NDVI_sd.1   : num  0.0111 0.0195 0.0227 0.0701 0.0197 ...
 $ NDVI_mean.2 : num  0.0703 0.0715 0.6832 0.769 0.0418 ...
 $ NDVI_sd.2   : num  0.0938 0.0298 0.1601 0.0674 0.0402 ...
 $ NDVI_mean.3 : num  0.0636 0.0552 0.6829 0.732 0.0292 ...
 $ NDVI_sd.3   : num  0.0912 0.0222 0.1613 0.1102 0.0355 ...
 $ NDVI_mean.4 : num  0.092 0.0781 0.6947 0.5256 0.056 ...
 $ NDVI_sd.4   : num  0.0879 0.0211 0.158 0.0686 0.0328 ...
 $ NDVI_mean.5 : num  0.1047 0.091 0.4251 0.3573 0.0722 ...
 $ NDVI_sd.5   : num  0.0441 0.013 0.0585 0.0368 0.0156 ...
 $ NDVI_mean.6 : num  0.0547 0.0654 0.5912 0.6098 0.0404 ...
 $ NDVI_sd.6   : num  0.0874 0.0195 0.2143 0.0975 0.0287 ...
 $ NDVI_mean.7 : num  0.1047 0.0882 0.6914 0.6532 0.0689 ...
 $ NDVI_sd.7   : num  0.0843 0.0177 0.1553 0.0653 0.0299 ...
 $ NDVI_mean.8 : num  0.0859 0.071 0.6905 0.6866 0.0556 ...
 $ NDVI_sd.8   : num  0.0809 0.018 0.1624 0.0866 0.0311 ...
 $ NDVI_mean.9 : num  0.0949 0.1204 0.1434 0.2849 0.1231 ...
 $ NDVI_sd.9   : num  0.00951 0.00719 0.01228 0.03483 0.01023 ...
 $ NDVI_mean.10: num  0.0854 0.0752 0.6712 0.7326 0.0628 ...
 $ NDVI_sd.10  : num  0.0789 0.0212 0.1471 0.0951 0.0326 ...
 $ NDVI_mean.11: num  0.0942 0.0986 0.6434 0.7741 0.0899 ...
 $ NDVI_sd.11  : num  0.0735 0.0188 0.1299 0.0765 0.0277 ...
 $ ID          : int  1 2 3 4 5 6 7 8 9 10


I would like to create a graph with the following characteristics:

The x-axis should show the different NDVI_mean variables (NDVI_mean, NDVI_mean.1, NDVI_mean.2, etc.)

The y-axis should show the values of those variables

I would like the graph to contain 10 lines which correspond to the 10 observations

I am completely new to ggplot2. I have managed to create some graphs using this code, but it is not what I want:

ggplot(NDVIdf) + geom_line(aes(x = NDVI_mean, y = ID))

Edit

Output of dput(NDVIdf)

> dput(NDVIdf)
structure(list(NDVI_mean = c(0.0607135413903215, 0.0551773354119158, 
0.58106114381679, 0.767559372904067, 0.0327779986531678, 0.0178615320775541, 
0.242088217197272, 0.20999285277774, 0.0393074640533382, 0.362654323805063
), NDVI_sd = c(0.0881409040764672, 0.0297587817960566, 0.164350402694459, 
0.0937447350009958, 0.0291673504162979, 0.0328954778667684, 0.154674728930805, 
0.143054126101431, 0.0394783704067313, 0.0311605700206721), NDVI_mean.1 = c(0.0210982854397687, 
0.0549092171312182, 0.137518549485032, 0.120670383289592, 0.0240424367577284, 
0.0159096452554129, 0.0672761385275565, 0.0341803495552938, -0.00134448575083377, 
0.016580205828644), NDVI_sd.1 = c(0.0110723260269658, 0.01951851232517, 
0.0227065359549684, 0.0700670943356275, 0.0196608837546022, 0.0121199724795787, 
0.0585473350749026, 0.0227914450038557, 0.0135058392129466, 0.0150912377421709
), NDVI_mean.2 = c(0.0703180934388137, 0.0714783174472453, 0.683213190781725, 
0.769036956136717, 0.0418112830812162, 0.0284743048998433, 0.23348292946483, 
0.235929665998861, 0.0450296798850473, 0.193100322654342), NDVI_sd.2 = c(0.0937618859311873, 
0.0298498793436821, 0.160085159223464, 0.0673810570033997, 0.0402149180186587, 
0.0397066821267195, 0.150683537602667, 0.145471088294412, 0.0437922991992655, 
0.0141486994532874), NDVI_mean.3 = c(0.063601350404069, 0.0551904437586304, 
0.682930851671194, 0.731967082842268, 0.0291583347580934, 0.0193111448443503, 
0.319631300233593, 0.166050320085929, 0.0366014763086276, 0.221872499210234
), NDVI_sd.3 = c(0.0912293427166813, 0.0222453701937956, 0.161322107465979, 
0.110207844254988, 0.0355049011856384, 0.0349930810428516, 0.226276210965238, 
0.140438890801978, 0.0376830428032925, 0.0220584182188743), NDVI_mean.4 = c(0.0919804842383724, 
0.0781265501499422, 0.694671427954745, 0.525632176936541, 0.055980386796576, 
0.0444207693277835, 0.426953378337129, 0.160783372015251, 0.0609390942283722, 
0.280378773041507), NDVI_sd.4 = c(0.0879414801936151, 0.0210552190044792, 
0.158033594194682, 0.0685669288657517, 0.0327713833848639, 0.0354769286383367, 
0.252469606866754, 0.12982572565032, 0.036836867665617, 0.0411465416161875
), NDVI_mean.5 = c(0.104738543138272, 0.0909713368375085, 0.42508525657118, 
0.357320549164012, 0.0721572385527876, 0.0663794698314188, 0.23911562990616, 
0.142111328436142, 0.0838823297267412, 0.251654432686439), NDVI_sd.5 = c(0.0441048632295888, 
0.0130326877444498, 0.0585279766430101, 0.0368348303398042, 0.0155510862094617, 
0.0192137216652464, 0.0585475549304422, 0.0736886597614494, 0.0212806524351524, 
0.0259938407933158), NDVI_mean.6 = c(0.054670992223019, 0.065381296775636, 
0.591168574495215, 0.609806323819807, 0.0403625648315437, 0.00811946347995523, 
0.310916693477462, 0.158498269033413, 0.0443372830506701, 0.371097291211943
), NDVI_sd.6 = c(0.0874297350407085, 0.0195485598323798, 0.214314782421285, 
0.0974716364190341, 0.0286726835375469, 0.0464844141552075, 0.124111018323546, 
0.128024962302557, 0.0389891785245309, 0.0776972260415738), NDVI_mean.7 = c(0.104681150177595, 
0.0882044567680745, 0.691354972003269, 0.65319672247453, 0.0689026139683328, 
0.0564115948034715, 0.345466555679804, 0.189094867024672, 0.0757218905916805, 
0.473698360613464), NDVI_sd.7 = c(0.0842570243914433, 0.0176701260206802, 
0.15529028462675, 0.0653161775993753, 0.0298643217871498, 0.0350802264835342, 
0.179839784256719, 0.123083052319927, 0.0381348337459403, 0.0504351117371588
), NDVI_mean.8 = c(0.08591308296691, 0.0710169977816541, 0.69050244219096, 
0.686550053201553, 0.05559868259279, 0.0386856009179078, 0.385427380396624, 
0.191494000375466, 0.0582605982908748, 0.634092671211804), NDVI_sd.8 = c(0.0809198942299164, 
0.0179761950231585, 0.162375086927031, 0.0865933396475938, 0.0311069109690036, 
0.0341123644056808, 0.221678551331174, 0.123510636808576, 0.0352156193862569, 
0.037441682380018), NDVI_mean.9 = c(0.0948832729278301, 0.120400127877444, 
0.143375746425582, 0.284877572639076, 0.123096886134381, 0.111171634746743, 
0.223262233590715, 0.120538120679937, 0.0971369124338333, 0.233815280325772
), NDVI_sd.9 = c(0.00951402820331221, 0.00718755312976778, 0.0122787887859583, 
0.0348298462083254, 0.0102326571562652, 0.00707392292527096, 
0.047660749828529, 0.0127667426992843, 0.00615732059786784, 0.0341929453840942
), NDVI_mean.10 = c(0.0853653180560701, 0.0751638478379656, 0.671240397847597, 
0.732629951796317, 0.0628200256114987, 0.0389376153602489, 0.261580310922298, 
0.237551383820751, 0.0543488352606212, 0.729810364384283), NDVI_sd.10 = c(0.0788860954632659, 
0.0212295674726634, 0.147125862744127, 0.0951195938807548, 0.0325819971840338, 
0.0313179762667036, 0.149062631594425, 0.123975319460501, 0.0295479331978356, 
0.0347570926406439), NDVI_mean.11 = c(0.0941656718689444, 0.0985743153705462, 
0.643407964040386, 0.774084527469533, 0.0899420980061257, 0.0535166413991826, 
0.303595683796766, 0.245633779631581, 0.0643377575135575, 0.697236179516483
), NDVI_sd.11 = c(0.0735030069394199, 0.0187835191570716, 0.129850840148202, 
0.0764938134743573, 0.0276954256603995, 0.0260888900038652, 0.122217568202193, 
0.0934074608484564, 0.0272831553843282, 0.026010072370012), ID = 1:10), .Names = c("NDVI_mean", 
"NDVI_sd", "NDVI_mean.1", "NDVI_sd.1", "NDVI_mean.2", "NDVI_sd.2", 
"NDVI_mean.3", "NDVI_sd.3", "NDVI_mean.4", "NDVI_sd.4", "NDVI_mean.5", 
"NDVI_sd.5", "NDVI_mean.6", "NDVI_sd.6", "NDVI_mean.7", "NDVI_sd.7", 
"NDVI_mean.8", "NDVI_sd.8", "NDVI_mean.9", "NDVI_sd.9", "NDVI_mean.10", 
"NDVI_sd.10", "NDVI_mean.11", "NDVI_sd.11", "ID"), row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")

Tags: r dataframe ggplot2 graph

1 answer

小情绪 Triste *

#2 · 2019-10-03 16:42

Your data is in wide format, but ggplot expects long format, so you need to convert it first. That is, you should have one row per observation per measurement. In your case, that means a dataframe with 12*2*10 = 240 rows and three columns: observation (ID, 1-10), statistic (NDVI_mean, NDVI_sd, NDVI_mean.1, ...) and value.

You can use tidyr's gather function to easily reformat the data into long format:

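library(tidyr)
library(ggplot2)

# Gather every column except ID into statistic/value pairs
NDVIdf_forplot <- gather(NDVIdf, key = statistic, value = value, -ID)
# Without a group aesthetic this does not yet draw one line per observation
ggplot(NDVIdf_forplot, aes(x = statistic, y = value)) + geom_line()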

The gather function takes two arguments: key and value, that set the column names for the relevant fields in the new long format dataframe. In this case, key is the name given to the new column that encodes the previous column headings (mean, sd, ...) and value to the previous values (0.0607,...).
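To make the reshaping concrete, here is a minimal sketch on a made-up two-row data frame. The name toy and its values are hypothetical and are not taken from NDVIdf; only the column naming follows the question.

```r
library(tidyr)

# Hypothetical toy data (not the asker's NDVIdf): 2 observations, 2 measurement columns
toy <- data.frame(ID = 1:2,
                  NDVI_mean = c(0.10, 0.20),
                  NDVI_sd   = c(0.01, 0.02))

# Every column except ID is stacked into statistic/value pairs
toy_long <- gather(toy, key = statistic, value = value, -ID)
print(toy_long)
```

Each original row contributes one row per gathered column, so the two observations and two measurement columns here give four rows.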

The expression -ID tells gather to leave the ID column out of the gathering, so it is kept as its own column and each record in the new dataframe is still associated with the correct ID. That way you can use it to separate the different lines in the ggplot call like this:

ggplot(NDVIdf_forplot, aes(x = statistic, y= value, group = ID)) + geom_line()

And, if you want to colour the different observations differently and include a legend, you can use the colour aesthetic:

ggplot(NDVIdf_forplot, aes(x = statistic, y = value, group = ID, colour = ID)) + 
    geom_line()
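The question asks for only the NDVI_mean variables on the x-axis, so one possible variation is sketched below. It is not part of the original answer and rests on a few assumptions: demo is a randomly generated stand-in with the same column names as NDVIdf (swap in the real data frame), mean_cols and long are illustrative names, and ID is wrapped in factor() so the legend shows ten discrete colours rather than a continuous gradient.

```r
library(tidyr)
library(ggplot2)

# Stand-in data with the same column layout as NDVIdf (values are random, for illustration only)
set.seed(1)
suffixes <- c("", paste0(".", 1:11))
cols <- as.vector(rbind(paste0("NDVI_mean", suffixes), paste0("NDVI_sd", suffixes)))
demo <- as.data.frame(matrix(runif(10 * length(cols)), nrow = 10,
                             dimnames = list(NULL, cols)))
demo$ID <- 1:10

# Keep only the NDVI_mean columns (plus ID) before reshaping
mean_cols <- grep("^NDVI_mean", names(demo), value = TRUE)
long <- gather(demo[, c("ID", mean_cols)], key = statistic, value = value, -ID)

# A character x-axis is sorted alphabetically (NDVI_mean.10 before NDVI_mean.2),
# so fix the order explicitly with factor levels
long$statistic <- factor(long$statistic, levels = mean_cols)

p <- ggplot(long, aes(x = statistic, y = value, group = ID, colour = factor(ID))) +
  geom_line() +
  labs(colour = "ID")
print(p)
```

With 12 NDVI_mean columns and 10 observations, long has 120 rows, and the plot should show one line per ID across the 12 variables in their original column order.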
