Error when subsetting based on adjusted values of

Posted 2019-09-21 02:56

I am asking a side question about the method I learned here from @redmode:

Subsetting based on values of a different data frame in R

When I try to dynamically adjust the level I want to subset by:

N <- nrow(A)
cond <- sapply(3:N, function(i) sum(A[i,] > 0.95*B[i,])==2)
rbind(A[1:2,], subset(A[3:N,], cond))

I get an error

Error in FUN(left, right) : non-numeric argument to binary operator. 

Can you think of a way I can get rows pertaining to values in A that are greater than 95% of the value in B? Thank you.

Here is code for A and B to play with.

A <- structure(list(name1 = c("trt", "0", "1", "10", "1", "1", "10"
), name2 = c("ctrl", "3", "1", "1", "1", "1", "10")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")
B <- structure(list(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"), 
    name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")

1 answer
看我几分像从前
#2 · 2019-09-21 03:21

You have some serious formatting issues with your data.
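
A quick check of the column types shows where the error comes from. This is only a small sketch: A and B are rebuilt compactly from the data in the question, and stringsAsFactors = FALSE is my assumption so that it behaves the same on older R versions.

```r
# Rebuild the question's data (same values, shorter constructor)
rn <- c("cond", "hour", "A", "B", "C", "D", "E")
A <- data.frame(name1 = c("trt", "0", "1", "10", "1", "1", "10"),
                name2 = c("ctrl", "3", "1", "1", "1", "1", "10"),
                row.names = rn, stringsAsFactors = FALSE)
B <- data.frame(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"),
                name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4"),
                row.names = rn, stringsAsFactors = FALSE)

# Every column mixes labels ("trt", "ctrl") with numbers, so R stores
# the whole column as character
col_types <- sapply(A, class)
print(col_types)

# Arithmetic on character data fails, which is the reported error
msg <- tryCatch(0.95 * B[3, ], error = function(e) conditionMessage(e))
print(msg)
```

Both columns come back as "character", so 0.95*B[i,] has nothing numeric to multiply.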

First, columns should be of the same data type, and rows should be observations (not always true, but a very good way to start). Here you have a row called cond, then a row called hour, then what I'm guessing is a series of classifications. The way your data is laid out doesn't make much sense and doesn't lend itself to easy manipulation. But all is not lost. This is what I would do:

Reorganize my data:

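# Flatten A to numeric and reshape it into two columns, then drop the
# two label rows (cond, hour) with [-(1:2), ]. as.numeric() warns about
# NAs for "trt"/"ctrl"; that is harmless here because those rows are dropped.
C <- data.frame(matrix(as.numeric(unlist(A)), ncol=2)[-(1:2), ])

colnames(C) <- c('A.trt', 'A.cntr')
rownames(C) <- LETTERS[1:nrow(C)]

# Same conversion for B
D <- data.frame(matrix(as.numeric(unlist(B)), ncol=2)[-(1:2), ])

colnames(D) <- c('B.trt', 'B.cntr')

# Combine side by side; the outer parentheses also print the result
(df <- cbind(C, D))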

Which gives:

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# B    10      1   1.0   10.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4

Then your problem is easily solved by:

df[which(df[, 1] > 0.95*df[, 3] & df[, 2] > 0.95*df[, 4]), ]

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4
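
If you would rather keep the original layout of A (label rows on top, original row names), the same comparison can be sketched without hard-coding column positions. This is just one possible variant, not the only way to do it: to_numeric is a hypothetical helper I made up for illustration, and it assumes the first two rows are always the label rows.

```r
# Rebuild the question's data (same values, shorter constructor)
rn <- c("cond", "hour", "A", "B", "C", "D", "E")
A <- data.frame(name1 = c("trt", "0", "1", "10", "1", "1", "10"),
                name2 = c("ctrl", "3", "1", "1", "1", "1", "10"),
                row.names = rn, stringsAsFactors = FALSE)
B <- data.frame(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"),
                name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4"),
                row.names = rn, stringsAsFactors = FALSE)

# Hypothetical helper: drop the first `skip` label rows, then convert
# the remaining values to numeric while keeping the row names
to_numeric <- function(df, skip = 2) {
  body <- df[-seq_len(skip), , drop = FALSE]
  out <- as.data.frame(lapply(body, as.numeric))
  rownames(out) <- rownames(body)
  out
}

A_num <- to_numeric(A)
B_num <- to_numeric(B)

# TRUE where every column of A exceeds 95% of the matching value in B
keep <- rowSums(A_num > 0.95 * B_num) == ncol(A_num)

# Label rows plus the matching data rows, still in the original format
result <- rbind(A[1:2, ], A[-(1:2), ][keep, ])
print(result)
# rows kept: cond, hour, A, C, D, E
```

Because the label rows are removed before as.numeric() is called, this version does not trigger the coercion warning, and it works unchanged if more columns are added.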