Unexpected behavior using -which() in R when the s

Posted 2019-02-24 18:52

Question:

I have been using R's which() function to remove rows from a data frame. I recently discovered that if the search term is NOT in the data frame, the result is an empty character vector.

# 1: returns A-Q, S-Z (as expected)
LETTERS[-which(LETTERS == "R")]
# 2: returns "character(0)" (not what I would expect)
LETTERS[-which(LETTERS == "1")]
# 3: returns A-Z (expected)
LETTERS[which(LETTERS != "1")]
# 4: returns A-Q, S-Z (expected)
LETTERS[which(LETTERS != "R")]

Is the second example the expected behavior for -which() when the search term is not found? I have already switched my code to use the syntax in example 4, which seems safer, but I am just curious.

Answer 1:

That is a well-known pitfall. When nothing matches the logical test, which() returns integer(0), and negating that still gives integer(0), so "[" returns nothing rather than everything, which is what one would expect. You can use logical indexing instead:

 LETTERS[ ! LETTERS == "1" ]
 LETTERS[ ! LETTERS %in% "1" ]
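As a quick check, here is a minimal sketch comparing the two forms above against the negated which() version, once with a term that is present and once with one that is not. The variable names are only illustrative.

```r
# Term that is present: all three approaches drop "R" and keep 25 letters.
present_which   <- LETTERS[-which(LETTERS == "R")]
present_logical <- LETTERS[!LETTERS == "R"]
present_in      <- LETTERS[!LETTERS %in% "R"]

# Term that is absent: the negated which() form loses everything,
# while the two logical forms keep all 26 letters.
absent_which   <- LETTERS[-which(LETTERS == "1")]
absent_logical <- LETTERS[!LETTERS == "1"]
absent_in      <- LETTERS[!LETTERS %in% "1"]

length(absent_which)    # 0
length(absent_logical)  # 26
length(absent_in)       # 26
```

The %in% form also extends naturally to dropping several values at once, for example LETTERS[!LETTERS %in% c("R", "1")].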

There is another gotcha to be aware of, and it is the one that makes me choose to use which(). With logical indexing, an NA in the index inside "[" still returns an element (for a data frame, a row filled with NAs). I generally do not want that, so I use DFRM[ which(logical), ], although this seems to bother some people who say it is not needed. I just think they are working with small datasets and infrequently encounter the annoyance of seeing tens of thousands of NA-induced useless lines of output on their console. I never use the negated which version though.
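The NA behavior described above can be seen in a small sketch. The vector x and the data frame df are made-up examples, not data from the question.

```r
x <- c("a", NA, "b", "c")

# x == "a" is TRUE, NA, FALSE, FALSE. The NA in the index produces an
# NA element in the result.
by_logical <- x[x == "a"]

# which() keeps only the positions that are TRUE, so the NA is dropped.
by_which <- x[which(x == "a")]

# The same effect on rows of a data frame.
df <- data.frame(id = 1:4, grp = c("a", NA, "b", "a"))
rows_logical <- df[df$grp == "a", ]         # includes an all-NA row
rows_which   <- df[which(df$grp == "a"), ]  # only the matching rows

nrow(rows_logical)  # 3
nrow(rows_which)    # 2
```

If NAs should simply count as "not a match", combining the condition with !is.na(), as in x[!is.na(x) & x == "a"], is another option that stays with logical indexing.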



Answer 2:

Because of this:

which(LETTERS == '-1')
## integer(0)

and this:

(1:2)[integer(0)]
## integer(0)

Instead of #4, use this:

LETTERS[LETTERS != "R"]


Answer 3:

In example 2, which() returns integer(0) (a zero-length integer vector) because no values are TRUE. Negating a zero-length vector (-integer(0)) still gives a zero-length vector. So you are effectively asking for zero elements of LETTERS, and you get character(0) back.
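The reasoning above can be checked step by step. The sketch below also includes a guarded helper for cases where the index form is still wanted; drop_matches is a hypothetical name used only for illustration, not a base R function.

```r
idx <- which(LETTERS == "1")   # no match, so idx is integer(0)

# Negating an empty vector leaves it empty, so the sign carries no information.
neg_is_same <- identical(-idx, idx)

# Indexing with an empty vector selects zero elements.
n_selected <- length(LETTERS[-idx])

# Hypothetical helper: only apply the negative index when there is
# something to drop; otherwise return the input unchanged.
drop_matches <- function(x, value) {
  idx <- which(x == value)
  if (length(idx) == 0L) x else x[-idx]
}

length(drop_matches(LETTERS, "1"))  # 26
length(drop_matches(LETTERS, "R"))  # 25
```

Because which() ignores NA, this helper also keeps NA elements of x instead of turning them into matches.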



Tags: r, which