I have a data frame with 2 variables. The first column "X" holds the date and time in the format dd/mm/yyyy hh:mm; the second column "Y" holds electricity meter readings, which are taken every 5 minutes. I want to sum the values for each half hour. For instance:
X Y
13/12/2014 12:00 1
13/12/2014 12:05 2
13/12/2014 12:10 1
13/12/2014 12:15 2
13/12/2014 12:20 2
13/12/2014 12:25 1
In the end I want to present the result as:
13/12/2014 12:00 9
13/12/2014 12:30 12
and so on...
Here's an alternative approach which actually takes `X` into account (as per the OP's comment).

First, we will make sure `X` is of proper `POSIXct` format so we can manipulate it correctly (I'm using the `data.table` package here for convenience).

Then, we will aggregate per cumulative minutes instances of `00` or `30` within `X` while summing `Y` and extracting the first value of `X` per each group. I've made a more complicated data set in order to illustrate more complicated scenarios (see below).

Data
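As a minimal sketch of the two steps above, assuming a small made-up data set in the question's layout (the values are illustrative, not the answer's own data, and `grp` and `res` are names chosen here), the conversion and the grouped aggregation might look like this:

```r
library(data.table)

# Hypothetical sample data: 5-minute readings covering two half-hour windows
df <- data.frame(
  X = c("13/12/2014 12:00", "13/12/2014 12:05", "13/12/2014 12:10",
        "13/12/2014 12:15", "13/12/2014 12:20", "13/12/2014 12:25",
        "13/12/2014 12:30", "13/12/2014 12:35", "13/12/2014 12:40"),
  Y = c(1, 2, 1, 2, 2, 1, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Convert to a data.table by reference and parse X as POSIXct
# (tz = "UTC" is an assumption; use the time zone of your readings)
setDT(df)[, X := as.POSIXct(X, format = "%d/%m/%Y %H:%M", tz = "UTC")]

# Order by time, then start a new group whenever the minutes are "00" or "30";
# within each group keep the first timestamp and the sum of Y
res <- df[order(X),
          .(X = X[1L], Y = sum(Y)),
          by = .(grp = cumsum(format(X, "%M") %in% c("00", "30")))]
res
```

With this sample input, `res` has two rows: the group starting at 12:00 sums to 9 and the group starting at 12:30 sums to 12.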
Some explanations

The `by` expression:
- `format(X, "%M")` gets the minutes out of `X` (see `?strptime`)
- We then check whether they match `00` or `30` (using `%in%`)
- `cumsum` separates these matched values into separate groups, which we aggregate by putting this expression into the `by` statement (see `?data.table`)

The `j`th expression `(X = X[1L], Y = sum(Y))` is simply getting the first value of `X` per each group and the sum of `Y` per each group.

The `i`th expression `order(X)` makes sure the data set is properly ordered by date (one of the main reasons I've converted `X` to proper `POSIXct` format).

For a better understanding of how `data.table`
works, see some tutorials here.

`t1` groups the values using integer division by 1800 (30 minutes).

Considering your data frame as `df`, you can try:
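As a rough sketch of the integer-division grouping described above (an illustration in base R, not necessarily the code originally posted), assuming `df` has a character column `X` and a numeric column `Y` as in the question; the sample values, `tm`, and `out` are hypothetical:

```r
# Hypothetical sample data in the question's layout
df <- data.frame(
  X = c("13/12/2014 12:00", "13/12/2014 12:05", "13/12/2014 12:10",
        "13/12/2014 12:15", "13/12/2014 12:20", "13/12/2014 12:25",
        "13/12/2014 12:30", "13/12/2014 12:35", "13/12/2014 12:40"),
  Y = c(1, 2, 1, 2, 2, 1, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Parse the timestamps (tz = "UTC" is an assumption)
tm <- as.POSIXct(df$X, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Integer division of the epoch seconds by 1800 gives one index per half hour
t1 <- as.numeric(tm) %/% 1800

# Sum Y within each half-hour bin
out <- aggregate(Y ~ t1, data = data.frame(t1 = t1, Y = df$Y), FUN = sum)

# Label each bin by its start time, in the question's date format
out$X <- format(as.POSIXct(out$t1 * 1800, origin = "1970-01-01", tz = "UTC"),
                "%d/%m/%Y %H:%M")
out[, c("X", "Y")]
```

Unlike the `cumsum` approach, this bins purely on clock time, so a half hour with a missing 00 or 30 reading still gets its own group. For the sample above it gives 9 for 13/12/2014 12:00 and 12 for 13/12/2014 12:30.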