I have a file that I would like to reshape using R. These are the commands that I am running:
x <- data.frame(read.table("total.txt", sep=",", header=TRUE))
y <- melt(x, id=c("Hostname", "Date", "MetricType"))
When I issue the following command to combine the date with the hour, I get an error and the window hangs:
yy <- cast(y, Hostname + Date + variable ~ MetricType)
This is the error:
Aggregation requires fun.aggregate: length used as default
ServerNa Date MetricType Hour Value
19502 server1 01/05/2012 MemoryAVG Hour5 41.830000
19503 server1 01/05/2012 CPUMaximum Hour5 9.000000
19504 server1 01/05/2012 CPUAVG+Sev Hour5 9.060000
19505 server1 01/05/2012 CPUAVG Hour5 30.460000
19506 server1 01/05/2012 61 Hour5 63.400000
19507 server1 01/05/2012 60 Hour5 59.300000
19508 server2 01/05/2012 MemoryAVG Hour5 10.690000
19509 server2 01/05/2012 CPUMaximum Hour5 1.000000
19510 server2 01/05/2012 CPUAVG+Sev Hour5 0.080000
19511 server2 01/05/2012 CPUAVG Hour5 1.350000
Is there an easy way to do this without hanging the server?
When I used library(reshape2) and this:
yy <- acast(y, Hostname + Date + variable ~ MetricType, fun.aggregate=mean)
all the values turn into NA. I have no clue what is going on.
Clarification: In the discussion below, I refer to dcast() rather than cast(). As Maiasaura notes in the comments, the function cast() from the reshape package has been replaced in the reshape2 package by two functions: dcast() (for data.frame output) and acast() (for array or matrix output). In any case, my comments about the need for a fun.aggregate argument hold equally for cast(), dcast(), and acast().

The error is being thrown because for at least one combination of the categorical variables in the call to cast(), your data.frame y must contain at least two rows of data. This is documented in ?cast (or ?dcast).

Run the code below to see how this works, and how it can be remedied. In the last line of code, I use the fun.aggregate argument to tell dcast() to use mean() to combine values for any repeated combination of variables. In its place, you can put whatever aggregation function best fits your own situation.
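
A minimal sketch along those lines, using a small made-up data frame rather than the asker's file (the values and the reduced set of columns are assumptions for illustration), might look like this. It shows the fallback to length(), one way to locate the repeated combinations, and the fix with fun.aggregate:

```r
library(reshape2)

# Made-up long-format data (an assumption, not the asker's file):
# the combination (server1, CPUAVG) appears twice.
y <- data.frame(
  Hostname   = c("server1", "server1", "server1", "server2", "server2"),
  MetricType = c("CPUAVG", "CPUAVG", "MemoryAVG", "CPUAVG", "MemoryAVG"),
  value      = c(30, 40, 41.83, 1.35, 10.69),
  stringsAsFactors = FALSE
)

# Without fun.aggregate, dcast() falls back to length() and prints a
# message; the cells then hold row counts, not the original values.
counts <- dcast(y, Hostname ~ MetricType, value.var = "value")

# Locate the rows whose (Hostname, MetricType) combination is repeated.
key  <- y[c("Hostname", "MetricType")]
dups <- y[duplicated(key) | duplicated(key, fromLast = TRUE), ]

# With fun.aggregate = mean, repeated combinations are averaged instead.
means <- dcast(y, Hostname ~ MetricType, value.var = "value", fun.aggregate = mean)
```

Here counts reports 2 for server1's CPUAVG cell, dups holds the two offending rows, and means replaces them with their average. Whether mean() is appropriate depends on why the duplicates exist, so inspecting dups first is usually worthwhile.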