Compute a rolling weighted sum by group

2020-02-15 15:36发布

问题:

I have a working example of a weighted sum of transactions over the last 12 hours. Now, I've added an account column and would like to compute this weighted sum separately by group. The code will run as written below. Uncomment the line starting with # account to add the account column to df. How can I modify the second to last line of code such that it computes rollapplyr separately on each account?

library(zoo)
library(tidyverse)

Create the example data:

set.seed(123)
randomDates <- function(N, st="2017-01-01 00:00:00", et="2017-02-01 23:59:59") {
     st <- as.POSIXct(st, tz = "UTC")

     et <- as.POSIXct(et, tz = "UTC")

     dt <- as.numeric(difftime(et,st,units="sec", tz="UTC"))
     ev <- sort(runif(N, 0, dt))
     rt <- st + ev
     rt
}

df <- data.frame(date = randomDates(100) , 
                 data = round( abs(rnorm(100)) * 100 ) # ,
                 # account = sample(c("A", "B", "C"), 100, replace=TRUE  )
)

df <- df %>% arrange(date)

Define the helper functions:

tau <- 0.00005
decay = function(tau, day){
  exp(-tau * day)
}

weighted <- function(x, tau) {
  tx <- as.numeric(time(x))
  seconds <- tail(tx, 1) - tx
  w <- (seconds < 43200) * decay(tau, seconds) # 12 hours in seconds
  sum(w * coredata(x))
}

Compute the rolling sum:

# Would like to modify this block to group by account
newData <- df %>%
  read.zoo  %>% 
  rollapplyr(43200, weighted, tau = tau, partial = TRUE, coredata = FALSE)

dfNew <- df %>% mutate( weighted_sum = newData  )

                   date data weighted_sum
1   2017-01-01 00:21:26   38    38.000000
2   2017-01-01 21:29:53   56    56.000000
3   2017-01-02 14:02:43   34    34.000000
4   2017-01-02 20:41:28    9    19.279179
5   2017-01-03 06:08:07  160   161.644215

I haven't yet found the answer based on my research:

Apply a rolling sum by group in R

use rollapply and zoo to calculate rolling average of a column of variables

https://www.rdocumentation.org/packages/zoo/versions/1.8-1/topics/rollapply

I've also tried this solution based on feedback to this question and a linked, possible duplicate answer. However, applying the same pattern results in an error that I haven't been able to resolve:

newData <- df %>%
  group_by(account) %>% 
  mutate(weighted_sum =  rollapplyr(., width=43200, FUN = weighted, 
                         tau = tau, partial = TRUE, coredata = FALSE)   ) %>% 
  ungroup()

Throws this error:

# Error in mutate_impl(.data, dots) : 
  Evaluation error: non-numeric argument to binary operator.
标签: r dplyr zoo