I'm new to the purrr
paradigm and am struggling with it.
Following a few sources I have managed to get so far as to nest a data frame, run a linear model on the nested data, extract some coefficients from each lm, and generate a summary for each lm. The last thing I want to do is extract the "r.squared" from the summary (which I would have thought would be the simplest part of what I'm trying to achieve), but for whatever reason I can't get the syntax right.
Here's a MWE of what I have that works:
library(purrr)
library(dplyr)
library(tidyr)
mtcars %>%
nest(-cyl) %>%
mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
sum = map(fit, ~summary))
and here's my attempt to extract the r.squared which fails:
mtcars %>%
nest(-cyl) %>%
mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
sum = map(fit, ~summary),
rsq = map_dbl(sum, "r.squared"))
Error in eval(substitute(expr), envir, enclos) : `x` must be a vector (not a closure)
This is superficially similar to the example given on the RStudio site:
mtcars %>%
split(.$cyl) %>%
map(~ lm(mpg ~ wt, data = .x)) %>%
map(summary) %>%
map_dbl("r.squared")
This works however I would like the r.squared values to sit in a new column (hence the mutate statement) and I'd like to understand why my code isn't working instead of working-around the problem.
EDIT:
Here's a working solution that I came to using the solutions below:
mtcars %>%
nest(-cyl) %>%
mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
summary = map(fit, glance),
r_sq = map_dbl(summary, "r.squared"))
EDIT 2:
So, it actually turns out that the bug is from the inclusion of the tilde key in the summary = map(fit, ~summary) line. My guess is that the makes the object a function which is nest and not the object returned by the summary itself. Would love an authoritative answer on this if someone wants to chime in.
To be clear, this version of the original code works fine:
mtcars %>%
nest(-cyl) %>%
mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
summary = map(fit, summary),
r_sq = map_dbl(summary, "r.squared"))