Get row with highest value from one column after c

Posted 2019-05-18 08:48

Question:

Suppose I have a dataframe that looks like this:

   v1 v2 v3 v4 v5 v6
r1 1  2  2  4  5  9
r2 1  2  2  4  5  10
r3 1  2  2  4  5  7
r4 1  2  2  4  5  12
r5 2  2  2  4  5  9
r6 2  2  2  4  5  10

I would like to get the row with the highest value in v6 that has the value 1 in v1. I know how to get all rows where v1 = 1 and select the first row of that, thanks to this answer to a previous question:

ddply( df , .variables = "v1" , .fun = function(x) x[1,] )

How can I change the function so that I get the row with the highest value in v6?

Answer 1:

Building on the previous results, I'd use [ with a logical comparison to subset on your first condition, and then subset that result on your second condition. The two steps matter because, as @sgibb points out in the comments, the maximum value of v6 over the whole data frame might not be in a row where v1 == 1.

#  Subset to those rows where v1 == 1
tmp <- df[ df$v1 == 1 , ]

#  Then select those rows where the max value of v6 appears
tmp[ tmp$v6 == max( tmp$v6 ) , ]

This returns every row that ties for the maximum. If you want only the first occurrence, use which.max() instead.
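As a minimal sketch of the which.max() variant, the snippet below rebuilds the data frame from the question (the construction code and the names first_max and per_group are assumptions added for illustration) and picks the first row with the largest v6 among the rows where v1 == 1. The last step shows a base R stand-in for the per-group ddply call, using split() and lapply().

```r
# Rebuild the example data frame from the question
df <- data.frame(
  v1 = c(1, 1, 1, 1, 2, 2),
  v2 = 2, v3 = 2, v4 = 4, v5 = 5,
  v6 = c(9, 10, 7, 12, 9, 10),
  row.names = paste0("r", 1:6)
)

# Rows where v1 == 1
tmp <- df[df$v1 == 1, ]

# which.max() returns the index of the first maximum, so exactly one row comes back
first_max <- tmp[which.max(tmp$v6), ]
print(first_max)

# Per-group version: one row per distinct v1, each with that group's largest v6
per_group <- do.call(
  rbind,
  lapply(split(df, df$v1), function(x) x[which.max(x$v6), ])
)
print(per_group)
```

The same idea should carry over to the ddply call in the question by replacing x[1,] with x[which.max(x$v6), ] inside the function.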



Answer 2:

We could also use the subset() function, like this:

x_sub <- subset(x, state == "C" & chainlength == 5 & segment == "C2C_REG")

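where x is the data frame and the second argument is a logical expression evaluated on its columns (the column names above are only an example and are not the ones from the question).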
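Applied to the data frame from the question, the subset() approach might look like the sketch below. The data frame construction and the names v1_rows and best are assumptions added for illustration. The maximum is computed only over the rows where v1 == 1, so a larger v6 in another group cannot interfere.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  v1 = c(1, 1, 1, 1, 2, 2),
  v2 = 2, v3 = 2, v4 = 4, v5 = 5,
  v6 = c(9, 10, 7, 12, 9, 10),
  row.names = paste0("r", 1:6)
)

# Step 1: keep only the rows where v1 == 1
v1_rows <- subset(df, v1 == 1)

# Step 2: among those rows, keep the ones holding the largest v6
best <- subset(v1_rows, v6 == max(v6))
print(best)
```

Like the == max() approach in the first answer, this keeps every row that ties for the maximum.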



Tags: r plyr