I have a working example of a weighted sum of transactions over the last 12 hours. Now, I've added an account column and would like to compute this weighted sum separately by group. The code will run as written below. Uncomment the line starting with # account
to add the account
column to df
. How can I modify the second to last line of code such that it computes rollapplyr
separately on each account
?
library(zoo)
library(tidyverse)
Create the example data:
set.seed(123)
randomDates <- function(N, st="2017-01-01 00:00:00", et="2017-02-01 23:59:59") {
st <- as.POSIXct(st, tz = "UTC")
et <- as.POSIXct(et, tz = "UTC")
dt <- as.numeric(difftime(et,st,units="sec", tz="UTC"))
ev <- sort(runif(N, 0, dt))
rt <- st + ev
rt
}
df <- data.frame(date = randomDates(100) ,
data = round( abs(rnorm(100)) * 100 ) # ,
# account = sample(c("A", "B", "C"), 100, replace=TRUE )
)
df <- df %>% arrange(date)
Define the helper functions:
tau <- 0.00005
decay = function(tau, day){
exp(-tau * day)
}
weighted <- function(x, tau) {
tx <- as.numeric(time(x))
seconds <- tail(tx, 1) - tx
w <- (seconds < 43200) * decay(tau, seconds) # 12 hours in seconds
sum(w * coredata(x))
}
Compute the rolling sum:
# Would like to modify this block to group by account
newData <- df %>%
read.zoo %>%
rollapplyr(43200, weighted, tau = tau, partial = TRUE, coredata = FALSE)
dfNew <- df %>% mutate( weighted_sum = newData )
date data weighted_sum
1 2017-01-01 00:21:26 38 38.000000
2 2017-01-01 21:29:53 56 56.000000
3 2017-01-02 14:02:43 34 34.000000
4 2017-01-02 20:41:28 9 19.279179
5 2017-01-03 06:08:07 160 161.644215
I haven't yet found the answer based on my research:
Apply a rolling sum by group in R
use rollapply and zoo to calculate rolling average of a column of variables
https://www.rdocumentation.org/packages/zoo/versions/1.8-1/topics/rollapply
I've also tried this solution based on feedback to this question and a linked, possible duplicate answer. However, applying the same pattern results in an error that I haven't been able to resolve:
newData <- df %>%
group_by(account) %>%
mutate(weighted_sum = rollapplyr(., width=43200, FUN = weighted,
tau = tau, partial = TRUE, coredata = FALSE) ) %>%
ungroup()
Throws this error:
# Error in mutate_impl(.data, dots) :
Evaluation error: non-numeric argument to binary operator.