My dataset has as features: players IDs, weeks and points.
I want to calculate the mean of points for previous weeks, but not all past weeks, just to the last 5 or less (if the current week is smaller than 5).
Example: For player_id = 5, week = 7, the result will be the average of POINTS for player_id = 5 and weeks 2, 3, 4, 5 and 6.
The following code already does the average for all previous week, so I need an adaptation to make it for just 5 previous week.
player_id<-c(rep(1,30),rep(2,30),rep(3,30),rep(4,30),rep(5,30))
week<-1:30
points<-round(runif(150,1,10),0)
mydata<- data.frame(player_id=player_id,week=rep(week,5),points)
mydata<-mydata %>%
group_by(player_id) %>% # the group to perform the stat on
arrange(week) %>% # order the weeks within each group
mutate(previous_mean = cummean(points) ) %>% # for each week get the
cumulative mean
mutate(previous_mean = lag(previous_mean) ) %>% # shift cumulative
mean back one week
arrange(player_id) # sort by player_id
You can use
slice
to select just the last 5 weeks for each group. Try this:The line
selects rows within the range [(last row - 4) : last row], for each group
EDIT: To avoid trouble when the current week is less than 5, use an
ifelse
statement to validate:HAVB's approach is great, but depending on what you want, here is another. This approach is adapted from this answer to a different question, but changed for your circumstances:
Then we can look at a subset to show it worked:
So we can see at each time t,
rolling_mean
for player i will be the mean of thepoints
observations for player i at times {t - 1, ..., min(1, t - 5)}.