Conditional calculating Maximum value in the colum

2019-04-14 23:26发布

问题:

I have the following table:

Class  x2  x3  x4
  A    14  45  53 
  A     8  18  17
  A    16  49  20
  B    78  21  48 
  B     8  18   5

I need for each "Class" (A and B) find the maximum value in column "X3", keep that row and delete other rows.

The output should be in format like:

Class  x2  x3  x4
  A    14  49  20
  B    78  21  48 

Please, ask me questions if something unclear in my problem.

Thank you!

回答1:

A base R approach could be:

mydf[as.logical(with(mydf, ave(x3, Class, FUN = function(x) x == max(x)))), ]
#   Class x2 x3 x4
# 3     A 16 49 20
# 4     B 78 21 48

However, note that if there are multiple values tied for max, it would return multiple rows for that group.


Here's a possible "data.table" approach:

library(data.table)
setkey(as.data.table(mydf), Class, x3)[, tail(.SD, 1), by = Class]
#    Class x2 x3 x4
# 1:     A 16 49 20
# 2:     B 78 21 48


回答2:

Here's another dplyr answer for the lot

library(dplyr)
df %>% group_by(Class) %>% filter(x3 == max(x3))
# Source: local data frame [2 x 4]
# Groups: Class
# 
#   Class x2 x3 x4
# 1     A 16 49 20
# 2     B 78 21 48

Which could also be

group_by(df, Class) %>% filter(x3 == max(x3))


回答3:

One way using dplyr would be:

library(dplyr)

foo %>%
    #For each Class
    group_by(Class) %>%
    # Sort rows in descending way using x3: you get the max x3 value on top
    # for each group
    arrange(desc(x3)) %>%
    # Select the first row for each Class
    slice(1)

#  Class x2 x3 x4
#1     A 16 49 20
#2     B 78 21 48

EDIT Given @Ananda's tie-values consideration and his suggestion in coments, you could do something like this as well. But, @Richard Acriven's idea is the way to go, if there are ties.

# Data
foo2 <- structure(list(Class = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), x2 = c(14L, 8L, 16L, 78L, 8L), x3 = c(49L, 
18L, 49L, 21L, 18L), x4 = c(53L, 17L, 20L, 48L, 5L)), .Names = c("Class", 
"x2", "x3", "x4"), class = "data.frame", row.names = c(NA, -5L
))

#  Class x2 x3 x4
#1     A 14 49 53
#2     A  8 18 17
#3     A 16 49 20
#4     B 78 21 48
#5     B  8 18  5  

foo2 %>% 
   group_by(Class) %>% 
   mutate(Rank = dense_rank(desc(x3))) %>%
   filter(Rank == 1)

#  Class x2 x3 x4 Rank
#1     A 14 49 53    1
#2     A 16 49 20    1
#3     B 78 21 48    1


回答4:

Try:

do.call(rbind, lapply(split(ddf, ddf$Class), function(x) tail(x[order(x$x3),],1)))
  Class x2 x3 x4
A     A 16 49 20
B     B 78 21 48