I have a data frame that looks like this:
Timedate TotalSolar_MW
20 2013-06-01 04:45:00 13.0
21 2013-06-01 05:00:00 41.7
22 2013-06-01 05:15:00 81.8
23 2013-06-01 05:30:00 153.0
24 2013-06-01 05:45:00 270.7
25 2013-06-01 06:00:00 429.3
26 2013-06-01 06:15:00 535.4
"Timedate" is POSIXlt and "TotalSolar_MW" is numeric. The time steps are in 15-minute intervals from June 1, 0:00 to June 24, 24:00.
Now I want to aggregate the quarter-hourly data to hourly steps, e.g. 2013-06-01 06:00:00 934.8 MW (81.8 MW + 153.0 MW + 270.7 MW + 429.3 MW; from 05:15 to 06:00).
I tried this with:
Sum <-aggregate(Total_Solar_Gesamt$TotalSolar_MW,
list(as.POSIXlt(Total_Solar_Gesamt$Timedate)$hour), FUN=sum)
But this aggregates by hour of day across the whole data frame and gives me a new data.frame with 24 rows, each holding the summed MW for that hour over all days.
How can I change this so that the data are only reduced from quarter-hourly to hourly intervals? I tried a for loop, but that didn't work. subset didn't work for me either.
Thanks for the help!
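A minimal sketch of why the attempt above collapses to 24 rows: `$hour` only carries the hour of day (0 to 23), so the same hour on different days lands in one group. The two timestamps and the 90.0 value below are made up for illustration, and the UTC time zone is an assumption.

```r
# Hypothetical two-day example: one reading at 05:15 on each day
times <- as.POSIXct(c("2013-06-01 05:15:00", "2013-06-02 05:15:00"), tz = "UTC")
mw <- c(81.8, 90.0)

# Grouping on hour of day only: both days share the key 5, so they are summed together
by_hour_of_day <- aggregate(mw, list(hour = as.POSIXlt(times)$hour), FUN = sum)

# Grouping on date plus hour keeps the two days apart
by_date_hour <- aggregate(mw, list(hour = format(times, "%Y-%m-%d %H:00")), FUN = sum)

print(by_hour_of_day)  # a single row for hour 5
print(by_date_hour)    # one row per day
```

Any grouping key that includes the date avoids the collapse; the key built with `format` here is only one way to do that.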
When working with time series, I suggest using the xts package, for example period.apply together with endpoints:
library(xts)
dat.xts <- xts(Total_Solar_Gesamt$TotalSolar_MW,
as.POSIXct(Total_Solar_Gesamt$Timedate))
# sum within each hour; endpoints() marks the last row of every hour
period.apply(dat.xts, endpoints(dat.xts, "hours"), sum)
period.apply is the lapply equivalent for time periods, so the same call generalizes. For example, to aggregate your data every 2 hours you can do the following:
ends <- endpoints(dat.xts, 'hours', 2)
period.apply(dat.xts, ends, sum)
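To make the role of the endpoints index more concrete, here is a small sketch on made-up data (16 quarter-hour steps, each with value 1, in an assumed UTC time zone). It only illustrates the shape of the index and is not taken from the original data set.

```r
library(xts)

# Hypothetical series: 16 quarter-hour readings, all equal to 1
times <- as.POSIXct("2013-06-01 00:15:00", tz = "UTC") + 900 * (0:15)
x <- xts(rep(1, 16), order.by = times)

# endpoints() returns row positions: it starts with 0 and ends with nrow(x);
# each value in between is the last row of a 2-hour block
ends <- endpoints(x, on = "hours", k = 2)

# period.apply() calls FUN on the rows between consecutive endpoints
two_hourly <- period.apply(x, INDEX = ends, FUN = sum)

print(ends)
print(two_hourly)
```

Because every row falls into exactly one block, the block sums add up to the total of the original series, which makes a quick sanity check.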
Total_Solar_Gesamt <- read.table(header=TRUE, sep=",", text="
Timedate, TotalSolar_MW
2013-06-01 04:45:00, 13.0
2013-06-01 05:00:00, 41.7
2013-06-01 05:15:00, 81.8
2013-06-01 05:30:00, 153.0
2013-06-01 05:45:00, 270.7
2013-06-01 06:00:00, 429.3
2013-06-01 06:15:00, 535.4
")
Use cut.POSIXt to divide the dates into hourly intervals:
Sum <- aggregate(Total_Solar_Gesamt["TotalSolar_MW"],
list(hour=cut(as.POSIXct(Total_Solar_Gesamt$Timedate), "hour")),
sum)
Sum
hour TotalSolar_MW
1 2013-06-01 04:00:00 13.0
2 2013-06-01 05:00:00 547.2
3 2013-06-01 06:00:00 964.7
Note that the above will group 06:00:00 with the other 06 times. If you want to group the top of the hour with the previous hour, just subtract one second from each timestamp:
Sum2 <- aggregate(Total_Solar_Gesamt["TotalSolar_MW"],
list(hour=cut(as.POSIXct(Total_Solar_Gesamt$Timedate)-1, "hour")),
sum)
Sum2
hour TotalSolar_MW
1 2013-06-01 04:00:00 54.7
2 2013-06-01 05:00:00 934.8
3 2013-06-01 06:00:00 535.4
And, if you want to report your dates one hour forward, like in your question:
Sum2$adjustedHour <- as.POSIXct(Sum2$hour) + 3600
Sum2
hour TotalSolar_MW adjustedHour
1 2013-06-01 04:00:00 54.7 2013-06-01 05:00:00
2 2013-06-01 05:00:00 934.8 2013-06-01 06:00:00
3 2013-06-01 06:00:00 535.4 2013-06-01 07:00:00
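As a possible alternative to subtracting one second and then adding an hour, the timestamps can be rounded up to the next full hour before grouping. This is only a sketch: the names hour_ending and Sum3 are made up here, and it assumes a time zone whose offset from UTC is a whole number of hours (UTC is used below), because the rounding works on seconds since the epoch.

```r
# Same seven readings as in the sample data, rebuilt so the block runs on its own
times <- as.POSIXct(c("2013-06-01 04:45:00", "2013-06-01 05:00:00",
                      "2013-06-01 05:15:00", "2013-06-01 05:30:00",
                      "2013-06-01 05:45:00", "2013-06-01 06:00:00",
                      "2013-06-01 06:15:00"), tz = "UTC")
mw <- c(13.0, 41.7, 81.8, 153.0, 270.7, 429.3, 535.4)

# Round each timestamp up to the next full hour; exact hours stay where they are
hour_ending <- as.POSIXct(ceiling(as.numeric(times) / 3600) * 3600,
                          origin = "1970-01-01", tz = "UTC")

# Sum the readings that share the same hour-ending label
Sum3 <- aggregate(data.frame(TotalSolar_MW = mw),
                  list(hour_ending = hour_ending), sum)
print(Sum3)
```

The sums match Sum2 (54.7, 934.8, 535.4), labeled with the hour-ending timestamps 05:00, 06:00 and 07:00.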
Using xts:
library(xts)
data.xts <- xts(Total_Solar_Gesamt$TotalSolar_MW,
as.POSIXct(Total_Solar_Gesamt$Timedate)-1)
# subtract 1 second, as discussed above
Sum.xts <- period.apply(data.xts, INDEX=endpoints(data.xts, "hours"), FUN=sum)
Sum.xts
[,1]
2013-06-01 04:59:59 54.7
2013-06-01 05:59:59 934.8
2013-06-01 06:14:59 535.4
Note how with xts, the timestamps in Sum.xts are the last timestamps of each hour. xts makes it easy to align them, though:
Sum.xts <- align.time(Sum.xts, 3600) # round up to next hour
Sum.xts
[,1]
2013-06-01 05:00:00 54.7
2013-06-01 06:00:00 934.8
2013-06-01 07:00:00 535.4