How to select the row with the maximum value in ea

I have a dataset with multiple observations per subject, and I want to build a subset that keeps only the observation with the maximum value for each subject. For example, given the following data set:

ID <- c(1,1,1,2,2,2,2,3,3)
Value <- c(2,3,5,2,5,8,17,3,5)
Event <- c(1,1,2,1,2,1,2,2,2)

group <- data.frame(Subject=ID, pt=Value, Event=Event)

Subjects 1, 2 and 3 have maximum pt values of 5, 17 and 5, respectively. How can I first find the maximum pt value for each subject and then put that observation into another data frame? The resulting subset should contain only the row with the maximum pt value for each subject.

Tags: r

8 answers

怪性笑人.

#2 · 2018-12-31 06:17

do.call(rbind, lapply(split(group, group$Subject), function(x) x[which.max(x$pt), ]))

Using Base R
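As a hedged base R alternative, assuming you want to keep every row that ties for the maximum, `ave()` can broadcast each subject's maximum back onto its rows. The names `group_max` and `max_rows` are illustrative only.

```r
# Rebuild the example data from the question
group <- data.frame(Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                    pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
                    Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2))

# ave() returns, for every row, the maximum pt within that row's Subject
group_max <- ave(group$pt, group$Subject, FUN = max)

# Keep rows whose pt equals their subject's maximum (all ties are kept)
max_rows <- group[group$pt == group_max, ]
max_rows
```

Unlike `which.max()`, which returns only the first maximum, this keeps all tied rows within a subject.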


春风洒进眼中

#3 · 2018-12-31 06:19

Here's a data.table solution:

require(data.table) ## 1.9.2
group <- as.data.table(group)

If you want to keep all the entries corresponding to max values of pt within each group:

group[group[, .I[pt == max(pt)], by=Subject]$V1]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2

If you'd like just the first max value of pt:

group[group[, .I[which.max(pt)], by=Subject]$V1]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2

In this case, it doesn't make a difference, as there aren't multiple maximum values within any group in your data.
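To see where the two forms diverge, here is a small sketch on hypothetical data (the table `ties` is invented for illustration and is not part of the question) in which subject 1 reaches its maximum twice:

```r
library(data.table)

# Hypothetical data: subject 1 reaches its maximum pt (5) on two rows
ties <- data.table(Subject = c(1, 1, 1, 2, 2),
                   pt      = c(5, 3, 5, 2, 8),
                   Event   = c(1, 1, 2, 1, 2))

# Every row tied for the maximum within each subject
all_max <- ties[ties[, .I[pt == max(pt)], by = Subject]$V1]

# Only the first maximum row within each subject
first_max <- ties[ties[, .I[which.max(pt)], by = Subject]$V1]

nrow(all_max)    # 3: both tied rows of subject 1, plus subject 2
nrow(first_max)  # 2: one row per subject
```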


人气声优

#4 · 2018-12-31 06:19

If you only need the maximum pt value for each subject (rather than the whole row), you can simply use:

   pt_max <- aggregate(pt ~ Subject, data = group, FUN = max)
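Because `aggregate()` returns only the Subject and pt columns, the Event column is lost. One possible way to recover the full rows, sketched here as an assumption about what the asker wants, is to merge the per-subject maxima back onto the original data. The name `max_rows` is illustrative.

```r
# Rebuild the example data from the question
group <- data.frame(Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                    pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
                    Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2))

# Per-subject maximum of pt (two columns: Subject, pt)
pt_max <- aggregate(pt ~ Subject, data = group, FUN = max)

# Join back on both columns to pick up Event for the maximum rows
max_rows <- merge(group, pt_max, by = c("Subject", "pt"))
max_rows
```

If a subject has several rows sharing the maximum, this join keeps all of them.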


不流泪的眼

#5 · 2018-12-31 06:24

A dplyr solution:

library(dplyr)
ID <- c(1,1,1,2,2,2,2,3,3)
Value <- c(2,3,5,2,5,8,17,3,5)
Event <- c(1,1,2,1,2,1,2,2,2)
group <- data.frame(Subject=ID, pt=Value, Event=Event)

group %>%
    group_by(Subject) %>%
    summarize(max.pt = max(pt))

This yields the following data frame:

  Subject max.pt
1       1      5
2       2     17
3       3      5
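Note that `summarize()` collapses each group to a single value, so the Event column is dropped. If the whole row is wanted, a minimal sketch (assuming ties should all be kept) is to filter within each group instead. The name `max_rows` is illustrative.

```r
library(dplyr)

# Rebuild the example data from the question
group <- data.frame(Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                    pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
                    Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2))

# Within each subject, keep the rows whose pt equals the group maximum
max_rows <- group %>%
    group_by(Subject) %>%
    filter(pt == max(pt)) %>%
    ungroup()
max_rows
```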


高级女魔头

#6 · 2018-12-31 06:27

A shorter solution using data.table:

setDT(group)[, .SD[which.max(pt)], by=Subject]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2
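Another way to get one row per subject in data.table, offered here as a sketch rather than as this answer's method, is to sort by descending pt and keep the first row of each subject with `unique()`. It uses `as.data.table()` so the original data frame is not modified by reference; the names `dt` and `max_rows` are illustrative.

```r
library(data.table)

# Rebuild the example data from the question
group <- data.frame(Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                    pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
                    Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2))

dt <- as.data.table(group)  # copy, leaves `group` untouched

# Sort so the largest pt comes first within each subject,
# then keep the first row encountered for each subject
max_rows <- unique(dt[order(Subject, -pt)], by = "Subject")
max_rows
```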


深知你不懂我心

#7 · 2018-12-31 06:29

The most intuitive method is to use the group_by and top_n functions in dplyr:

    group %>% group_by(Subject) %>% top_n(1, pt)

The result is:

    Source: local data frame [3 x 3]
    Groups: Subject [3]

      Subject    pt Event
        (dbl) (dbl) (dbl)
    1       1     5     2
    2       2    17     2
    3       3     5     2
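In dplyr 1.0.0 and later, `top_n()` is superseded by `slice_max()`. A rough equivalent, assuming a recent dplyr is installed, might look like this. Here `with_ties = FALSE` is an added choice that returns exactly one row per subject; the default `with_ties = TRUE` keeps ties, as `top_n()` does.

```r
library(dplyr)

# Rebuild the example data from the question
group <- data.frame(Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                    pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
                    Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2))

# One row per subject: the row with the largest pt
max_rows <- group %>%
    group_by(Subject) %>%
    slice_max(pt, n = 1, with_ties = FALSE) %>%
    ungroup()
max_rows
```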

