My dataset looks something like this:
Section Time x
s3 9:35 2
s4 9:35 2
s1 9:36 1
s2 10:01 1
s8 11:00 2
I want to group the data by section at hourly intervals and sum the x values that fall within each interval.
My expected output is:
sec Time x
s1 9:00-10:00 1
s2 9:00-10:00 0
s3 9:00-10:00 2
s4 9:00-10:00 2
s8 9:00-10:00 0
s1 10:00-11:00 0
s2 10:00-11:00 1
s3 10:00-11:00 0
s4 10:00-11:00 0
s8 10:00-11:00 1
I tried to follow this post on Stack Overflow, but I get the error shown below for my query. Here x is my data frame:
data.frame(value = tapply(cbind(x$x),
list(sec= x$section,cut(x$Time, breaks="1 hour")),
sum))
Error in cut.default(x$Time, breaks = "1 hour") : 'x' must be numeric
I am not even sure whether this approach is right. I have never worked with time data in R, so any help on how to achieve this would be much appreciated.
I think the problem is that your Time column is in character format. Anyway, here is a quick and dirty approach using dplyr:
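A minimal sketch of this kind of approach, not necessarily the code originally posted. It assumes the data frame is called dat (the question calls it x) and that Time is stored as character; the hour and interval columns are helper names introduced here for illustration:

```r
library(dplyr)

# Sample data from the question; Time is kept as plain text
dat <- data.frame(
  Section = c("s3", "s4", "s1", "s2", "s8"),
  Time    = c("9:35", "9:35", "9:36", "10:01", "11:00"),
  x       = c(2, 2, 1, 1, 2),
  stringsAsFactors = FALSE
)

result <- dat %>%
  # The hour is the text before the colon, e.g. "9:35" -> 9
  mutate(hour     = as.integer(sub(":.*$", "", Time)),
         interval = sprintf("%d:00-%d:00", hour, hour + 1L)) %>%
  # Sum x within each Section/interval pair
  group_by(Section, interval) %>%
  summarise(x = sum(x)) %>%
  ungroup()

print(result)
```

This only returns Section/interval combinations that actually occur in the data. The zero rows in the expected output would need an extra step, for example tidyr::complete().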
Here is an alternative version:
This gives the following data frame:
The trick here, as in @Tutuchan's suggestion, is to ignore that the times are actually times (as they would be in a POSIXct object) and to treat them simply as character strings. I hope this helps.
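The string-based idea can be sketched in base R as follows. This is only an illustration under assumptions, not the code from this answer: the data frame name dat, the helper column hour, and the choice of xtabs() are all introduced here.

```r
# Sample data from the question; Time is kept as plain text
dat <- data.frame(
  Section = c("s3", "s4", "s1", "s2", "s8"),
  Time    = c("9:35", "9:35", "9:36", "10:01", "11:00"),
  x       = c(2, 2, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Treat Time purely as text: keep the part before the colon as the hour
dat$hour <- as.integer(sub(":.*$", "", dat$Time))

# xtabs() sums x for every Section/hour combination;
# combinations with no data come out as 0
totals <- as.data.frame(xtabs(x ~ Section + hour, data = dat),
                        responseName = "x")

# Sort by hour, then by Section, as in the expected output
totals <- totals[order(totals$hour, totals$Section), ]
print(totals)
```

Because xtabs() builds the full cross-table, every section appears once per hour, which is close to the zero-filled layout asked for in the question.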
Update / Edit
As I mentioned previously in a comment, my former version of the code did not perform the requested sum of x over equal Sections falling into the same time frame. This is corrected in the updated version posted above, but I decided to give up trying to do all this in base R. Eventually, I used the plyr package.
Another option is to use the class POSIXct and then, in the function cut applied to date-time objects, specify "hour" in the argument breaks. See ?cut.POSIXt:
Output:
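A minimal sketch of the POSIXct approach described above, not the code originally posted. It assumes the data frame is called dat; the placeholder date 2015-01-01 and the UTC time zone are arbitrary choices made here, only so that the clock times can be parsed:

```r
# Sample data from the question
dat <- data.frame(
  Section = c("s3", "s4", "s1", "s2", "s8"),
  Time    = c("9:35", "9:35", "9:36", "10:01", "11:00"),
  x       = c(2, 2, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Convert the clock times to POSIXct; the date part is a placeholder (assumption)
dat$Time <- as.POSIXct(paste("2015-01-01", dat$Time),
                       format = "%Y-%m-%d %H:%M", tz = "UTC")

# cut() on a POSIXct vector dispatches to cut.POSIXt, which accepts breaks = "hour";
# each level is labelled by the start of its hourly interval
dat$interval <- cut(dat$Time, breaks = "hour")

# Sum x per Section and hourly interval
out <- aggregate(x ~ Section + interval, data = dat, FUN = sum)
print(out)
```

This also shows why the call in the question failed: with a character Time column, cut() falls back to cut.default, which requires numeric input.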