Dynamic Grouping in R | Grouping based on condition

Posted 2019-07-20 22:02

When using R's aggregate() function, how can I specify a stopping condition for the grouping, based on the function applied to the variable?

For example, I have a data frame, "df", like this: [Input data frame]

Note: Each row in the input data frame represents a single ball played by a player in that match, so counting the rows gives the number of balls required.

I want my output data frame to look like this: [Output data frame]. What I need to know is: how many balls are required to score 10 runs?

Currently, I am using this R code: group_data <- aggregate(df$score, by = list(Category = df$player, df$match), FUN = sum, na.rm = TRUE)

With this code I cannot stop the grouping where I want: it only stops once it has grouped all rows of each player and match. I don't want all rows to be considered.

How can I add a constraint like "stop grouping as soon as score >= 10"? My sole purpose in adding this constraint is to count the number of rows needed to satisfy the condition.

Thanks in advance.

Tags: r aggregate
1 answer
迷人小祖宗
#2 · 2019-07-20 22:46

Here is one option using dplyr:

library(dplyr)
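# Within each match/player group, keep the balls up to and including the one
# on which the running total first reaches 10 (the question asks for >= 10)
df1 %>%
    group_by(match, player) %>%
    filter(!lag(cumsum(score) >= 10, default = FALSE)) %>%
    summarise(score = sum(score), Count = n())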
# A tibble: 2 x 4
# Groups:   match [?]
#   match player score Count
#   <int>  <int> <dbl> <int>
#1     1     30    12     2
#2     2     31    15     3
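
To see why the filter keeps the ball that crosses the threshold, it may help to look at the intermediate values per group. The following is only a sketch, using the question's condition (>= 10); the column names running, reached, and drop are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Same sample data as in the answer
df1 <- data.frame(
  match  = c(1L, 1L, 1L, 2L, 2L, 2L),
  player = c(30L, 30L, 30L, 31L, 31L, 31L),
  score  = c(6, 6, 6, 3, 6, 6)
)

steps <- df1 %>%
  group_by(match, player) %>%
  mutate(
    running = cumsum(score),                 # running total within the group
    reached = running >= 10,                 # TRUE from the ball that hits the target
    drop    = lag(reached, default = FALSE)  # TRUE only for balls AFTER that one
  ) %>%
  ungroup()

print(steps)
```

Because drop is the reached flag shifted down by one row, the ball on which the target is first reached still has drop == FALSE and is kept; only the later balls are filtered out.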

data

df1 <- structure(list(match = c(1L, 1L, 1L, 2L, 2L, 2L), player = c(30L, 
30L, 30L, 31L, 31L, 31L), score = c(6, 6, 6, 3, 6, 6)), .Names = c("match", 
 "player", "score"), row.names = c(NA, -6L), class = "data.frame")
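
Since the question asks about aggregate(), a base R variant may also be of interest. This is a minimal sketch, assuming the rows are already in playing order and score contains no NA values; the helper name balls_to_target and its target argument are hypothetical.

```r
# Same sample data as above
df1 <- data.frame(
  match  = c(1L, 1L, 1L, 2L, 2L, 2L),
  player = c(30L, 30L, 30L, 31L, 31L, 31L),
  score  = c(6, 6, 6, 3, 6, 6)
)

# Number of balls until the running total first reaches `target`;
# returns NA if the target is never reached.
balls_to_target <- function(score, target = 10) {
  hit <- which(cumsum(score) >= target)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

# Apply the helper to each match/player group
result <- aggregate(score ~ match + player, data = df1, FUN = balls_to_target)
names(result)[3] <- "Count"
print(result)
```

Here the stopping condition lives inside the function passed to FUN rather than in aggregate() itself, which always hands the function every row of a group.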