Suppose I have a dataframe that looks like this:
   v1 v2 v3 v4 v5 v6
r1  1  2  2  4  5  9
r2  1  2  2  4  5 10
r3  1  2  2  4  5  7
r4  1  2  2  4  5 12
r5  2  2  2  4  5  9
r6  2  2  2  4  5 10
I would like to get the row with the highest value in v6 that has the value 1 in v1.
I know how to get all rows where v1 = 1 and select the first row of that, thanks to this answer to a previous question:
ddply( df , .variables = "v1" , .fun = function(x) x[1,] )
How can I change the function so that I get the row with the highest value in v6?
From the previous results, I'd use [ to subset on your first condition using logical comparators and then do a second subset on your second condition because, as @sgibb points out in the comments, the max value of v6 might not be in a row where v1 == 1.
# Subset to those rows where v1 == 1
tmp <- df[ df$v1 == 1 , ]
# Then select those rows where the max value of v6 appears
tmp[ tmp$v6 == max( tmp$v6 ) , ]
If you only want the first occurrence of the maximum, use which.max() instead.
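A minimal sketch of the which.max() variant. The data.frame() call is my reconstruction of the table printed in the question, and first_max is just an illustrative name:

```r
# Rebuild the example data (reconstructed from the table in the question)
df <- data.frame(
  v1 = c(1, 1, 1, 1, 2, 2),
  v2 = 2, v3 = 2, v4 = 4, v5 = 5,
  v6 = c(9, 10, 7, 12, 9, 10),
  row.names = paste0("r", 1:6)
)

# Subset to those rows where v1 == 1
tmp <- df[df$v1 == 1, ]

# which.max() returns the index of the first maximum,
# so exactly one row comes back even if the maximum is tied
first_max <- tmp[which.max(tmp$v6), ]
first_max
```

Unlike the == max() comparison, this never returns more than one row, which may or may not be what you want when there are ties.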
We could also use the subset() function, like this:
x_sub <- subset(x, state == "C" & chainlength == 5 & segment == "C2C_REG")
where x is the data frame and the second argument is a logical expression that selects the rows to keep.
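Applied to the data in the question, this could look roughly like the sketch below. The data.frame() call is my reconstruction of the printed table, and tmp and best are illustrative names:

```r
# Rebuild the example data (reconstructed from the table in the question)
df <- data.frame(
  v1 = c(1, 1, 1, 1, 2, 2),
  v2 = 2, v3 = 2, v4 = 4, v5 = 5,
  v6 = c(9, 10, 7, 12, 9, 10),
  row.names = paste0("r", 1:6)
)

# First condition: keep only the rows where v1 == 1
tmp <- subset(df, v1 == 1)

# Second condition: within those rows, keep the row(s) holding the max of v6
best <- subset(tmp, v6 == max(v6))
best
```

Doing it in two steps keeps max(v6) restricted to the rows where v1 == 1, rather than taking the maximum over the whole data frame.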