Using broom and tidyverse to summarise r squared g

Posted 2019-05-21 09:03

Question:

I posted a question here and was able to reproduce Claus' answer, which calculates multiple R-squared values for each species in an additive model using tidyverse on the iris data. However, after a package update, the R-squared values no longer appear in the output. Not sure why...
Here are Claus' response and output:

library(tidyverse)
library(broom)
iris %>% nest(-Species) %>% 
  mutate(fit = map(data, ~mgcv::gam(Sepal.Width ~ s(Sepal.Length, bs = "cs"), data = .)),
         results = map(fit, glance),
         R.square = map(fit, ~ summary(.)$r.sq)) %>%
  unnest(results) %>%
  select(-data, -fit)

#      Species  R.square       df    logLik      AIC      BIC deviance df.residual
# 1     setosa 0.5363514 2.546009 -1.922197 10.93641 17.71646 3.161460    47.45399
# 2 versicolor 0.2680611 2.563623 -3.879391 14.88603 21.69976 3.418909    47.43638
# 3  virginica 0.1910916 2.278569 -7.895997 22.34913 28.61783 4.014793    47.72143

Yet the same code now produces the following output for me, with <dbl [1]> in the R.square column instead of the values:

library(tidyverse)
library(broom)
iris %>% nest(-Species) %>% 
  mutate(fit = map(data, ~mgcv::gam(Sepal.Width ~ s(Sepal.Length, bs = "cs"), data = .)),
         results = map(fit, glance),
         R.square = map(fit, ~ summary(.)$r.sq)) %>%
  unnest(results) %>%
  select(-data, -fit)

     Species  R.square       df    logLik      AIC      BIC deviance
      <fctr>    <list>    <dbl>     <dbl>    <dbl>    <dbl>    <dbl>
1     setosa <dbl [1]> 2.396547 -1.973593 10.74028 17.23456 3.167966
2 versicolor <dbl [1]> 2.317501 -4.021222 14.67745 21.02058 3.438361
3  virginica <dbl [1]> 2.278569 -7.895997 22.34913 28.61783 4.014793

Can anyone provide insight as to why?

Answer 1:

I have the same sessionInfo as the OP (see comments above). I can fix this by forcing R-squared to be a double using map_dbl. I'm not totally sure why it works for Akrun as is...?

iris %>% nest(-Species) %>% 
  mutate(fit = map(data, ~mgcv::gam(Sepal.Width ~ s(Sepal.Length, bs = "cs"), data = .)),
         results = map(fit, glance),
         R.square = map_dbl(fit, ~ summary(.)$r.sq)) %>%
  unnest(results) %>%
  select(-data, -fit)

# A tibble: 3 x 8
  Species    R.square    df logLik   AIC   BIC deviance df.residual
  <fct>         <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>       <dbl>
1 setosa        0.536  2.55  -1.92  10.9  17.7     3.16        47.5
2 versicolor    0.268  2.56  -3.88  14.9  21.7     3.42        47.4
3 virginica     0.191  2.28  -7.90  22.3  28.6     4.01        47.7
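
To illustrate why map_dbl changes the column type, here is a minimal sketch that assumes only purrr and the built-in iris data. It uses lm() instead of mgcv::gam() to keep the example small, and the names fits, r2_list and r2_dbl are hypothetical, not taken from the original answer. map() always returns a list, so each R-squared ends up as a length-1 element (printed as <dbl [1]> in a tibble), whereas map_dbl() returns a plain double vector.

```r
library(purrr)

# One simple linear model per species (lm is a stand-in for the GAM above)
fits <- map(split(iris, iris$Species),
            ~ lm(Sepal.Width ~ Sepal.Length, data = .x))

# map() always returns a list: one length-1 numeric per species
r2_list <- map(fits, ~ summary(.x)$r.squared)

# map_dbl() returns an atomic double vector, and errors if any
# element is not a single number
r2_dbl <- map_dbl(fits, ~ summary(.x)$r.squared)

str(r2_list)  # list of 3
str(r2_dbl)   # named numeric vector of length 3
```

Inside mutate(), the list version becomes a list-column that unnest(results) leaves untouched, which would be consistent with the <dbl [1]> entries in the question's output.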