I'm not sure this is the right section for this question. I looked around and did not find an answer, so here it is:
I have a CSV file ordered as follows:
dat <- read.csv(text="Date,Demand
01/01/2012 00:00:00,5061.5
01/01/2012 00:05:00,5030.0
01/01/2012 00:10:00,5011.5
01/01/2012 00:15:00,4983.5
01/01/2012 00:20:00,4963.4
01/01/2012 00:25:00,4980.6
01/01/2012 00:30:00,4969.4
01/01/2012 00:35:00,4961.7
01/01/2012 00:40:00,4929.0
01/01/2012 00:45:00,4907.1
01/01/2012 00:50:00,4892.8
01/01/2012 00:55:00,4870.1
01/01/2012 01:00:00,4860.4",header=TRUE)
The date format is, I guess, %m/%d/%Y %H:%M:%S (or possibly %d/%m/%Y %H:%M:%S; the sample rows do not show which comes first).
I'd like to summarize the demand in order to obtain an aggregation on the hour as follows:
01/01/2012 00:00:00.................59 560.6 MGW/h
# which is the sum of the first 12 rows.
01/01/2012 01:00:00.................xxxxxxx MGW/h
01/01/2012 02:00:00.................xxxxxxx MGW/h
Of course, my file is much larger than that: it has more than 1 million lines.
I hope I made myself clear. There may also be a date format problem; if so, does anyone know how to convert it to the correct format? I tried as.Date,
but the result was not what I expected.
Using the example data, something like this could work:
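The code for this answer did not survive here, so what follows is only a minimal base R sketch of one way to do it, not necessarily the original approach. It assumes the dates are day-first (`%d/%m/%Y`), which the sample cannot confirm, and the column names `Hour` and `hourly` are made up for illustration.

```r
# Example data from the question
dat <- read.csv(text = "Date,Demand
01/01/2012 00:00:00,5061.5
01/01/2012 00:05:00,5030.0
01/01/2012 00:10:00,5011.5
01/01/2012 00:15:00,4983.5
01/01/2012 00:20:00,4963.4
01/01/2012 00:25:00,4980.6
01/01/2012 00:30:00,4969.4
01/01/2012 00:35:00,4961.7
01/01/2012 00:40:00,4929.0
01/01/2012 00:45:00,4907.1
01/01/2012 00:50:00,4892.8
01/01/2012 00:55:00,4870.1
01/01/2012 01:00:00,4860.4", header = TRUE, stringsAsFactors = FALSE)

# Parse the timestamps. as.Date would drop the time of day, so use as.POSIXct.
# Day-first order is an assumption; swap %d and %m if the file is month-first.
dat$Date <- as.POSIXct(dat$Date, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Label each row with the start of the hour it falls in
dat$Hour <- format(dat$Date, "%Y-%m-%d %H:00:00")

# Sum the demand within each hour
hourly <- aggregate(Demand ~ Hour, data = dat, FUN = sum)
print(hourly)
```

With the sample rows, the 00:00 hour sums to 59560.6, matching the figure in the question, and the single 01:00 row forms its own group.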
I recommend you check out the xts package, which is very good for any time series analysis. The following example shows how you can get sums over any periodicity:
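The original example is missing here, so this is a sketch of how hourly sums might look with xts, using the sample values from the question. The variable names (`times`, `demand`, `x`, `ep`, `hourly`) are illustrative, and the timestamps are built directly as POSIXct in UTC rather than parsed from the CSV.

```r
library(xts)

# Rebuild the sample series: 13 readings, 5 minutes apart
times <- seq(as.POSIXct("2012-01-01 00:00:00", tz = "UTC"),
             by = "5 min", length.out = 13)
demand <- c(5061.5, 5030.0, 5011.5, 4983.5, 4963.4, 4980.6, 4969.4,
            4961.7, 4929.0, 4907.1, 4892.8, 4870.1, 4860.4)
x <- xts(demand, order.by = times)

# Row positions of the last observation in each hour (starts with 0)
ep <- endpoints(x, on = "hours")

# Sum the observations between consecutive end points
hourly <- period.apply(x, INDEX = ep, FUN = sum)
print(hourly)
```

Note that each result is stamped with the last observation of its period (00:55 for the first hour here), not with the start of the hour.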
A more generic example below shows how you can apply any function over any periodicity:
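Again as a hedged sketch rather than the original code: the same pattern with a different function and period, here the mean over 30-minute blocks. The choice of `mean` and of a 30-minute window is only for illustration.

```r
library(xts)

# Same sample series as in the question
times <- seq(as.POSIXct("2012-01-01 00:00:00", tz = "UTC"),
             by = "5 min", length.out = 13)
demand <- c(5061.5, 5030.0, 5011.5, 4983.5, 4963.4, 4980.6, 4969.4,
            4961.7, 4929.0, 4907.1, 4892.8, 4870.1, 4860.4)
x <- xts(demand, order.by = times)

# End points of every 30-minute block: on = "minutes" with k = 30
ep30 <- endpoints(x, on = "minutes", k = 30)

# Any function of a block of observations can be passed as FUN
half_hourly_mean <- period.apply(x, INDEX = ep30, FUN = mean)
print(half_hourly_mean)
```

Changing `on` (for example to "days" or "weeks") and `FUN` gives other aggregations without changing the rest of the code.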
In the examples above, the endpoints function creates the INDEX of end points of the periods over which you want to apply a function.