Error when subsetting based on adjusted values of

I am asking a side question about the method I learned from @redmode here:

Subsetting based on values of a different data frame in R

When I try to dynamically adjust the level I want to subset by:

N <- nrow(A)
cond <- sapply(3:N, function(i) sum(A[i,] > 0.95*B[i,])==2)
rbind(A[1:2,], subset(A[3:N,], cond))

I get an error

Error in FUN(left, right) : non-numeric argument to binary operator.

Can you think of a way I can get rows pertaining to values in A that are greater than 95% of the value in B? Thank you.

Here is code for A and B to play with.

A <- structure(list(name1 = c("trt", "0", "1", "10", "1", "1", "10"
), name2 = c("ctrl", "3", "1", "1", "1", "1", "10")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")
B <- structure(list(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"), 
    name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")

Tags: r formatting conditional dataframe subset

1 Answer

看我几分像从前

#2 · 2019-09-21 03:21

You have some serious formatting issues with your data.
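The error message points at the same thing. As a quick check, here is a minimal sketch that rebuilds A from the question with data.frame() (equivalent to the structure() call, as far as I can tell) and looks at the column types. The names col_classes and res are only illustrative.

```r
# Rebuild A as posted in the question; every value is a string
A <- data.frame(name1 = c("trt", "0", "1", "10", "1", "1", "10"),
                name2 = c("ctrl", "3", "1", "1", "1", "1", "10"),
                row.names = c("cond", "hour", "A", "B", "C", "D", "E"),
                stringsAsFactors = FALSE)

# The text in the cond row forces both columns to be character
col_classes <- sapply(A, class)
print(col_classes)

# Arithmetic on a character column fails; capture the message instead of stopping
res <- tryCatch(0.95 * A[3, ], error = function(e) conditionMessage(e))
print(res)
```

Since B is built the same way, 0.95*B[i,] fails for the same reason.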

First, each column should hold a single data type, and rows should be observations (not always true, but a very good place to start). Here you have a row called cond, then a row called hour, then what I'm guessing is a series of classifications. Because the cond row holds text, every column is stored as character, which is why the multiplication fails. The way your data is laid out doesn't lend itself to easy manipulation. But all is not lost. This is what I would do:

Reorganize my data:

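# Drop the two header rows (cond, hour) first, then convert the remaining
# strings to numeric; this avoids "NAs introduced by coercion" warnings
C <- data.frame(matrix(as.numeric(unlist(A[-(1:2), ])), ncol = 2))

colnames(C) <- c('A.trt', 'A.cntr')
rownames(C) <- LETTERS[1:nrow(C)]

D <- data.frame(matrix(as.numeric(unlist(B[-(1:2), ])), ncol = 2))

colnames(D) <- c('B.trt', 'B.cntr')

# Put A and B side by side; the outer parentheses print the result
(df <- cbind(C, D))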

Which gives:

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# B    10      1   1.0   10.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4

Then your problem is easily solved by:

df[which(df[, 1] > 0.95*df[, 3] & df[, 2] > 0.95*df[, 4]), ]

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4
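If the original layout has to be kept, with the cond and hour rows on top as in the question's rbind() call, the same comparison can be done by converting only the data rows. This is a rough sketch, assuming rows 3 onward contain nothing but numeric strings. The names data_rows, A_num, B_num, keep and result are made up for illustration.

```r
# A and B as posted in the question (all values are strings)
rn <- c("cond", "hour", "A", "B", "C", "D", "E")
A <- data.frame(name1 = c("trt", "0", "1", "10", "1", "1", "10"),
                name2 = c("ctrl", "3", "1", "1", "1", "1", "10"),
                row.names = rn, stringsAsFactors = FALSE)
B <- data.frame(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"),
                name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4"),
                row.names = rn, stringsAsFactors = FALSE)

# Convert only the data rows (3..N) to numeric matrices
data_rows <- 3:nrow(A)
A_num <- sapply(A[data_rows, ], as.numeric)
B_num <- sapply(B[data_rows, ], as.numeric)

# Keep a row only if every column of A exceeds 95% of the matching value in B
keep <- rowSums(A_num > 0.95 * B_num) == ncol(A_num)

# Header rows stay on top, followed by the rows that passed
result <- rbind(A[1:2, ], A[data_rows, ][keep, ])
print(result)
```

Comparing the row sum against ncol(A_num) instead of a hard-coded 2 means the check still works if more columns are added.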
