I have the following table:
Class x2 x3 x4
A 14 45 53
A 8 18 17
A 16 49 20
B 78 21 48
B 8 18 5
I need for each "Class" (A and B) find the maximum value in column "X3", keep that row and delete other rows.
The output should be in format like:
Class x2 x3 x4
A 14 49 20
B 78 21 48
Please, ask me questions if something unclear in my problem.
Thank you!
A base R approach could be:
mydf[as.logical(with(mydf, ave(x3, Class, FUN = function(x) x == max(x)))), ]
# Class x2 x3 x4
# 3 A 16 49 20
# 4 B 78 21 48
However, note that if there are multiple values tied for max
, it would return multiple rows for that group.
Here's a possible "data.table" approach:
library(data.table)
setkey(as.data.table(mydf), Class, x3)[, tail(.SD, 1), by = Class]
# Class x2 x3 x4
# 1: A 16 49 20
# 2: B 78 21 48
Here's another dplyr
answer for the lot
library(dplyr)
df %>% group_by(Class) %>% filter(x3 == max(x3))
# Source: local data frame [2 x 4]
# Groups: Class
#
# Class x2 x3 x4
# 1 A 16 49 20
# 2 B 78 21 48
Which could also be
group_by(df, Class) %>% filter(x3 == max(x3))
One way using dplyr
would be:
library(dplyr)
foo %>%
#For each Class
group_by(Class) %>%
# Sort rows in descending way using x3: you get the max x3 value on top
# for each group
arrange(desc(x3)) %>%
# Select the first row for each Class
slice(1)
# Class x2 x3 x4
#1 A 16 49 20
#2 B 78 21 48
EDIT
Given @Ananda's tie-values consideration and his suggestion in coments,
you could do something like this as well. But, @Richard Acriven's idea is
the way to go, if there are ties.
# Data
foo2 <- structure(list(Class = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), x2 = c(14L, 8L, 16L, 78L, 8L), x3 = c(49L,
18L, 49L, 21L, 18L), x4 = c(53L, 17L, 20L, 48L, 5L)), .Names = c("Class",
"x2", "x3", "x4"), class = "data.frame", row.names = c(NA, -5L
))
# Class x2 x3 x4
#1 A 14 49 53
#2 A 8 18 17
#3 A 16 49 20
#4 B 78 21 48
#5 B 8 18 5
foo2 %>%
group_by(Class) %>%
mutate(Rank = dense_rank(desc(x3))) %>%
filter(Rank == 1)
# Class x2 x3 x4 Rank
#1 A 14 49 53 1
#2 A 16 49 20 1
#3 B 78 21 48 1
Try:
do.call(rbind, lapply(split(ddf, ddf$Class), function(x) tail(x[order(x$x3),],1)))
Class x2 x3 x4
A A 16 49 20
B B 78 21 48