How to select_if in dplyr, where the logical condi

Posted 2020-07-06 06:20

I want to select all the numeric columns from a data frame, and then all the non-numeric columns. The obvious way to select the numeric ones is:

mtcars %>%
    select_if(is.numeric) %>%
    head()

This works exactly as I expect.

mtcars %>%
    select_if(!is.numeric) %>%
    head()

This doesn't, and produces the error message Error in !is.numeric : invalid argument type
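
A minimal base R sketch of where this error comes from: the ! operator is applied to the function object is.numeric itself, before select_if gets a chance to call it on any column. The variable names res and negated are only illustrative.

```r
# `!` negates values (logical, numeric, raw); a function object is none of these.
res <- tryCatch(!is.numeric, error = function(e) conditionMessage(e))
print(res)

# Calling the function first yields a logical value, which `!` can negate.
negated <- !is.numeric("a")
print(negated)  # TRUE
```

So the negation has to happen on the result of calling is.numeric, not on is.numeric itself.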

Another way to do the same thing:

mtcars %>%
    select_if(sapply(., is.numeric)) %>%
    head()

works perfectly, but

mtcars %>%
    select_if(sapply(., !is.numeric)) %>%
    head()

fails with the same error message. (purrr::keep behaves exactly the same way).

In both cases, using - to drop the unwanted columns fails too: the is.numeric version gives the same error as above, and the sapply version gives Error: Can't convert an integer vector to function.
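
One way to make the sapply variant work, sketched here in base R, is to negate the logical vector that sapply returns rather than the function passed to it. The data frame df and its columns are hypothetical example data, not part of the question.

```r
# Hypothetical example data: two numeric columns and one character column.
df <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8), label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

is_num <- sapply(df, is.numeric)   # named logical vector, one entry per column
non_numeric <- df[!is_num]         # negate the vector, not the function
print(names(non_numeric))          # "label"
```

Since select_if also accepts a logical vector with one element per column, select_if(!sapply(., is.numeric)) should behave the same way inside a pipe.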

The help page for is.numeric says

is.numeric is an internal generic primitive function: you can write methods to handle specific classes of objects, see InternalMethods. ... Methods for is.numeric should only return true if the base type of the class is double or integer and values can reasonably be regarded as numeric (e.g., arithmetic on them makes sense, and comparison should be done via the base type).

The help page for ! says

Value

For !, a logical or raw vector(for raw x) of the same length as x: names, dims and dimnames are copied from x, and all other attributes (including class) if no coercion is done.

Looking at the related question "Negation ! in a dplyr pipeline %>%", I can see some of the reasons why this doesn't work, but neither of the solutions suggested there works here.

mtcars %>%
    select_if(not(is.numeric())) %>%
    head()

gives the reasonable error Error in is.numeric() : 0 arguments passed to 'is.numeric' which requires 1.

mtcars %>%
    select_if(not(is.numeric(.))) %>%
    head()

fails with the error Error in tbl_if_vars(.tbl, .predicate, caller_env(), .include_group_vars = TRUE) : length(.p) == length(tibble_vars) is not TRUE.
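
The length check in that message can be reproduced in base R. In this sketch, is.numeric(.) is evaluated once on the whole data frame and returns a single value, whereas a logical predicate needs one value per column. The names whole and per_column are illustrative, and ! stands in for magrittr's not().

```r
whole <- is.numeric(mtcars)               # FALSE: a data frame is a list, not a numeric vector
per_column <- sapply(mtcars, is.numeric)  # one TRUE/FALSE per column

print(length(!whole))      # 1
print(length(per_column))  # 11, one per column of mtcars
```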

This behaviour definitely violates the principle of least surprise. It's not of great consequence to me now, but it suggests I am failing to understand some more fundamental point.

Any thoughts?

3 answers
虎瘦雄心在
Reply #2 · 2020-07-06 07:03
mtcars %>%
  select_if(funs(!is.numeric(.))) %>%
  head()

This also selects only the non-numeric columns.
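
Note that funs() has been deprecated since dplyr 0.8.0. A sketch of the same idea with a formula-style lambda, which select_if accepts from that version onward, is shown below. The data frame df is hypothetical example data.

```r
library(dplyr)

# Hypothetical example data with one non-numeric column.
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# The formula is converted to a function that is called on each column in turn.
res <- df %>% select_if(~ !is.numeric(.))
print(names(res))  # "label"
```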

可以哭但决不认输i
Reply #3 · 2020-07-06 07:07

Negating a predicate function can be done with the dedicated Negate() or purrr::negate() functions (rather than the ! operator, which negates a vector):

library(dplyr)

mtcars %>% 
  mutate(foo = "bar") %>% 
  select_if(Negate(is.numeric)) %>% 
  head()

#   foo
# 1 bar
# 2 bar
# 3 bar
# 4 bar
# 5 bar
# 6 bar

Or with purrr::negate() (lower-case), which has slightly different behavior; see the respective help pages:

library(purrr)
library(dplyr)

mtcars %>% 
  mutate(foo = "bar") %>% 
  select_if(negate(is.numeric)) %>% 
  head()

#   foo
# 1 bar
# 2 bar
# 3 bar
# 4 bar
# 5 bar
# 6 bar
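
To see what Negate() produces without dplyr, here is a small base R sketch: it returns a new function that can be called on a single column or passed to Filter(). The names df and is_not_numeric are hypothetical.

```r
# Hypothetical example data with one non-numeric column.
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

is_not_numeric <- Negate(is.numeric)  # a function, not a logical value
print(is_not_numeric("a"))            # TRUE
print(is_not_numeric(1L))             # FALSE

# Filter() applies the predicate to each column and keeps the matching ones.
non_numeric <- Filter(is_not_numeric, df)
print(names(non_numeric))             # "label"
```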
闹够了就滚
Reply #4 · 2020-07-06 07:13

You could define your own "is not numeric" function and use that instead:

is_not_num <- function(x) !is.numeric(x)

mtcars %>%
  select_if(is_not_num) %>%
  head()
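
For readers on dplyr 1.0.0 or later (an assumption about the installed version), select_if is superseded by select() combined with where(), and the selection itself can be negated with !. A minimal sketch with a hypothetical data frame df:

```r
library(dplyr)

# Hypothetical example data with one non-numeric column.
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# Here `!` negates a column selection, not a function, so it is valid.
res <- df %>% select(!where(is.numeric))
print(names(res))  # "label"
```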