This problem is similar to "How do I do a conditional sum which only looks between certain date criteria", but it differs slightly, and the answer given there does not fit my case. The main difference is that the dates within each group are not necessarily complete (i.e., certain dates may be missing).
Input:
input <- read.table(text="
2017-04-01 A 1
2017-04-02 B 2
2017-04-02 B 2
2017-04-02 C 2
2017-04-02 A 2
2017-04-03 C 3
2017-04-04 A 4
2017-04-05 B 5
2017-04-06 C 6
2017-04-07 A 7
2017-04-08 B 8
2017-04-09 C 9")
colnames(input) <- c("Date","Group","Score")
Rule: for each group at each date, look back 3 calendar days (including the current date) and calculate the sum of Score.
Expected output:
Date Group 3DaysSumPerGroup
2017-04-01 A 1 # 1: the previous two dates are not available; a partial window is allowed
2017-04-02 A 3 # 2+1: both 04-01 and 04-02 are in range
2017-04-04 A 6 # 4+2
2017-04-07 A 7 # 7
2017-04-02 B 4 # 2+2: two rows on the same day
2017-04-05 B 5
2017-04-08 B 8
2017-04-02 C 2
2017-04-03 C 5
2017-04-06 C 6
2017-04-09 C 9
I tried to use rollapply with partial=TRUE, but the result doesn't seem correct.
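library(dplyr); library(zoo)
# width = 3 counts rows (observations), not calendar days
input %>%
  group_by(Group) %>%
  arrange(Date) %>%
  mutate("3DaysSumPerGroup" = rollapply(data = Score, width = 3, align = "right", FUN = sum, partial = TRUE, na.rm = TRUE))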
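A likely reason for the mismatch is that rollapply's width counts rows rather than calendar days, so a gap in the dates widens the window. Here is a minimal sketch for group A only; the names d, s, by_rows and by_days are made up for illustration:

```r
library(zoo)

# Group A from the input above: dates with gaps, and the matching scores
d <- as.Date(c("2017-04-01", "2017-04-02", "2017-04-04", "2017-04-07"))
s <- c(1, 2, 4, 7)

# Row-based window: always the last 3 rows, regardless of the gaps in d
by_rows <- rollapply(s, width = 3, FUN = sum, align = "right", partial = TRUE)

# Calendar-based window: only rows dated within [d[i] - 2, d[i]]
by_days <- sapply(seq_along(d), function(i) sum(s[d >= d[i] - 2 & d <= d[i]]))

print(by_rows)  # 1 3 7 13
print(by_days)  # 1 3 6 7
```

Only the calendar-based version matches the expected output for group A (1, 3, 6, 7).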
Here's a (supposedly efficient) solution using the new non-equi joins and the by = .EACHI feature in data.table (v1.9.8+):
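A minimal sketch of how such a non-equi join could look, not necessarily the exact code the answer had in mind. The helper table keys, the column Start, and the output column Sum3Days are hypothetical names chosen for illustration:

```r
library(data.table)

# Recreate the input from the question
input <- read.table(text = "
2017-04-01 A 1
2017-04-02 B 2
2017-04-02 B 2
2017-04-02 C 2
2017-04-02 A 2
2017-04-03 C 3
2017-04-04 A 4
2017-04-05 B 5
2017-04-06 C 6
2017-04-07 A 7
2017-04-08 B 8
2017-04-09 C 9", col.names = c("Date", "Group", "Score"), stringsAsFactors = FALSE)

setDT(input)
input[, Date := as.IDate(Date)]        # proper date class, so date arithmetic works

# One row per (Group, Date), plus the start of its 3-day window (assumed name: Start)
keys <- unique(input[, .(Group, Date)])
keys[, Start := Date - 2L]

# Non-equi join: for each row of keys, match the rows of input in the same Group
# whose Date lies in [Start, Date], and sum Score once per row of keys (by = .EACHI)
sums <- input[keys, on = .(Group, Date <= Date, Date >= Start),
              sum(Score), by = .EACHI]$V1

result <- keys[, .(Date, Group, Sum3Days = sums)]
setorder(result, Group, Date)
print(result)
```

On the sample data this should reproduce the expected output, with one row per Group and Date combination. Because by = .EACHI evaluates the sum once per row of keys, duplicated dates such as the two B rows on 2017-04-02 collapse into a single row.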