How to ignore cor.test:“not enough finite observat

Posted 2019-07-28 07:05

Question:

I have the following working toy example:

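library(tidyverse)  # dplyr, tidyr, purrr, ggplot2
library(broom)      # tidy()
library(ggpmisc)    # stat_poly_eq()

# Rows 1-102 keep all of setosa and versicolor but only 2 virginica rows
trunctiris <- iris[1:102, ]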
analysis <- trunctiris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(model = map(data, ~lm(Sepal.Length ~ Sepal.Width, data = .)),
         cor = map(data, ~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3)))

stats <- analysis %>%
  unnest(cor)

ggplot(trunctiris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(shape = 21) +
  geom_text(data = stats, aes(label = sprintf("r = %s", round(estimate, 3)), x = 7, y = 4)) +
  geom_text(data = stats, aes(label = sprintf("p = %s", round(p.value, 3)),  x = 7, y = 3.8)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~")),
               formula = y ~ x,
               parse = TRUE) +
  facet_wrap(~Species)

The code was provided in another question. However, I haven't been able to make it work with my data. The problem is that some (not all) of my groups have fewer than 3 observations, so in the "analysis" step R returns:

Error in mutate_impl(.data, dots) : not enough finite observations

This happens because one group (in this case virginica) does not have enough observations. I want to get around this, and I've tried 'try(if nrow(data) >= 2)' and similar approaches, such as the following:

analysis <- iris %>% 
group_by(Species) %>% 
nest() %>% mutate(model = map(data, ~lm (Sepal.Length ~ Sepal.Width, data = .)), 
    cor = if_else( nrow(data) <= 2 , warning ("Must have at least 3 rows of data"), 
        (map(data, ~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3)))))

which returns:

Error in mutate_impl(.data, dots) : not enough finite observations
In addition: Warning message:
In if_else(nrow(list(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, :
  Must have at least 3 rows of data

Does anyone know an easy way to get around this? I'd like to skip the problematic group and keep on going.

Many thanks and sorry for my very basic R skills.

Answer 1:

purrr::safely and purrr::possibly make it easy to guard against errors while mapping. In this case, a good strategy is to wrap the call to tidy(cor.test(...)) in possibly and return an empty data.frame if an error occurs:

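library(dplyr)   # group_by(), mutate()
library(tidyr)   # nest(), unnest()
library(purrr)   # map(), possibly()
library(broom)   # tidy()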
analysis <- trunctiris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(
    model = map(data, ~lm(Sepal.Length ~ Sepal.Width, data = .)),
    cor = map(data, possibly(
      ~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3), otherwise = data.frame())
    )
  )
# A tibble: 3 × 4
     Species              data    model                  cor
      <fctr>            <list>   <list>               <list>
1     setosa <tibble [50 × 4]> <S3: lm> <data.frame [1 × 8]>
2 versicolor <tibble [50 × 4]> <S3: lm> <data.frame [1 × 8]>
3  virginica  <tibble [2 × 4]> <S3: lm> <data.frame [0 × 0]> #<- Note the empty df here

Which becomes:

unnest(analysis)
# A tibble: 2 × 9
     Species  estimate statistic      p.value parameter  conf.low conf.high
      <fctr>     <dbl>     <dbl>        <dbl>     <int>     <dbl>     <dbl>
1     setosa 0.7425467  7.680738 6.709843e-10        48 0.5851391 0.8460314
2 versicolor 0.5259107  4.283887 8.771860e-05        48 0.2900175 0.7015599
# ... with 2 more variables: method <fctr>, alternative <fctr>

And so the group that gave an error is successfully removed from the end result.
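
The answer mentions purrr::safely but only demonstrates possibly. As a rough, self-contained sketch of the difference, the code below runs both wrappers on one valid group and one group that is too small. The helper cor_row and the groups list are hypothetical names introduced here for illustration; they are not part of the original code. It also includes a plain row-count check, which is roughly what the if_else attempt in the question was aiming for.

```r
library(purrr)

# Two toy groups: one with enough rows for cor.test(), one with only two.
groups <- list(
  ok        = iris[1:50, ],
  too_small = iris[101:102, ]
)

# Hypothetical helper: correlation estimate and p-value as a one-row data frame.
# cor.test() stops with "not enough finite observations" when n < 3.
cor_row <- function(d) {
  ct <- cor.test(d$Sepal.Length, d$Sepal.Width)
  data.frame(estimate = unname(ct$estimate), p.value = ct$p.value)
}

# possibly(): swallow the error and substitute a fallback value.
res_possibly <- map(groups, possibly(cor_row, otherwise = data.frame()))

# safely(): never fails; each element is list(result = ..., error = ...).
res_safely <- map(groups, safely(cor_row))

# Explicit guard: test the row count per group, inside the mapped function.
res_guard <- map(groups, function(d) {
  if (nrow(d) >= 3) cor_row(d) else data.frame()
})

# Names of the groups that failed under safely()
failed <- names(keep(res_safely, ~ !is.null(.x$error)))
```

With safely, the failing groups can be reported (here via failed) rather than silently dropped, which may be preferable when you need to know which groups were skipped. The explicit guard works because the check runs once per group inside the mapped function, instead of being applied to the whole list column at once.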