I'm trying to tighten up a %>% piped workflow where I need to apply the same function to several columns but with one argument changed each time. I feel like purrr's map or invoke functions should help, but I can't wrap my head around it.
My data frame has columns for life expectancy, poverty rate, and median household income. I can pass all these column names to vars in mutate_at, use round as the function to apply to each, and optionally supply a digits argument. But I can't figure out a way, if one exists, to pass a different digits value for each column. I'd like life expectancy rounded to 1 digit, poverty rounded to 2, and income rounded to 0.
I can call mutate on each column, but given that I might have more columns all receiving the same function with only one additional argument changed, I'd like something more concise.
library(tidyverse)
df <- tibble::tribble(
~name, ~life_expectancy, ~poverty, ~household_income,
"New Haven", 78.0580437642378, 0.264221051111753, 42588.7592521085
)
In my imagination, I could do something like this:
df %>%
  mutate_at(vars(life_expectancy, poverty, household_income),
            round, digits = c(1, 2, 0))
But I get the error:
Error in mutate_impl(.data, dots) :
  Column `life_expectancy` must be length 1 (the number of rows), not 3
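The error appears to come from round() recycling over its digits argument: with three digits values, each one-row column comes back with three values. A minimal base R sketch of that behaviour, using the life expectancy value from the data above (the variable names x and rounded are only for illustration):

```r
# One value from the life_expectancy column
x <- 78.0580437642378

# round() is vectorised over digits, so one input value yields three outputs
rounded <- round(x, digits = c(1, 2, 0))

length(rounded)  # 3, which is what mutate_at() rejects for a 1-row column
print(rounded)
```

So the digits vector is applied in full to every column instead of being matched to columns by position.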
Calling mutate_at once per column works (I use mutate_at instead of mutate here only to keep the same syntax as in my ideal case):
df %>%
  mutate_at(vars(life_expectancy), round, digits = 1) %>%
  mutate_at(vars(poverty), round, digits = 2) %>%
  mutate_at(vars(household_income), round, digits = 0)
#> # A tibble: 1 x 4
#> name life_expectancy poverty household_income
#> <chr> <dbl> <dbl> <dbl>
#> 1 New Haven 78.1 0.26 42589
Mapping over the digits applies every digits option to each column instead of matching them by position, giving me 3 rows, each rounded to a different number of digits.
df %>%
  mutate_at(vars(life_expectancy, poverty, household_income),
            function(x) map(x, round, digits = c(1, 2, 0))) %>%
  unnest()
#> # A tibble: 3 x 4
#> name life_expectancy poverty household_income
#> <chr> <dbl> <dbl> <dbl>
#> 1 New Haven 78.1 0.3 42589.
#> 2 New Haven 78.1 0.26 42589.
#> 3 New Haven 78 0 42589
Created on 2018-11-13 by the reprex package (v0.2.1)
2 solutions
mutate with !!!
invoke was a good idea, but you need it less now that most tidyverse functions support the !!! operator. Here's what you can do:
digits <- c(life_expectancy = 1, poverty = 2, household_income = 0)
df %>% mutate(!!!imap(digits, ~ round(..3[[.y]], .x), .))
# # A tibble: 1 x 4
# name life_expectancy poverty household_income
# <chr> <dbl> <dbl> <dbl>
# 1 New Haven 78.1 0.26 42589
..3 is the initial data frame, passed to the function as a third argument through the dot at the end of the call.
Written more explicitly:
df %>% mutate(!!!imap(
  digits,
  function(digit, name, data) round(data[[name]], digit),
  data = .))
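To see what actually gets spliced into mutate(), it can help to run the imap() call on its own. This is a minimal sketch, assuming the one-row df from the question (rebuilt here so the block runs by itself); new_cols is a name introduced only for illustration, and the data frame is referenced directly instead of being passed through the dot.

```r
library(purrr)
library(tibble)

df <- tribble(
  ~name, ~life_expectancy, ~poverty, ~household_income,
  "New Haven", 78.0580437642378, 0.264221051111753, 42588.7592521085
)
digits <- c(life_expectancy = 1, poverty = 2, household_income = 0)

# imap() passes each element of digits as the first argument and its name
# as the second, and keeps the names on the resulting list
new_cols <- imap(digits, function(digit, name) round(df[[name]], digit))

str(new_cols)
```

Since new_cols is a list named after the columns, mutate(df, !!!new_cols) amounts to writing one name = value argument per column, which overwrites the original columns with their rounded versions.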
If you need to start from your old interface (though the one I propose will be more flexible), first do:
digits <- setNames(c(1, 2, 0), c("life_expectancy", "poverty", "household_income"))
mutate_at and <<-
Here we bend the good practice of avoiding <<- whenever possible, but readability matters and this one is really easy to read.
digits <- c(1, 2, 0)
i <- 0
df %>%
  mutate_at(vars(life_expectancy, poverty, household_income), ~ round(., digits[i <<- i + 1]))
# A tibble: 1 x 4
# name life_expectancy poverty household_income
# <chr> <dbl> <dbl> <dbl>
# 1 New Haven 78.1 0.26 42589
(or just df %>% mutate_at(names(digits), ~ round(., digits[i <<- i + 1])) if you use a named vector as in my first solution)
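One thing to watch with the counter approach: i lives outside the pipeline, so it has to be reset to 0 before every run. The sketch below illustrates this in base R, with lapply() standing in for mutate_at() (that substitution, and the names cols and round_next, are assumptions made to keep the example dependency-free).

```r
digits <- c(1, 2, 0)
cols <- list(
  life_expectancy  = 78.0580437642378,
  poverty          = 0.264221051111753,
  household_income = 42588.7592521085
)

# Round x with the next entry of digits, advancing the outer counter i
round_next <- function(x) round(x, digits[i <<- i + 1])

i <- 0
first_run <- lapply(cols, round_next)   # uses digits[1], digits[2], digits[3]

# i is now 3, so a second run reads digits[4], digits[5], digits[6]: all NA
second_run <- lapply(cols, round_next)

i <- 0                                  # resetting the counter fixes it
third_run <- lapply(cols, round_next)
```

Here second_run contains only NA values, while third_run matches first_run again.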
Here's a map2 solution along the lines of Henrik's comment. You can then wrap this inside a custom function. I provided a rough first attempt, but I have done minimal testing, so it probably breaks in all sorts of situations if evaluation is strange. It also doesn't use tidyselect for .at, but neither does modify_at...
library(tidyverse)
df <- tibble::tribble(
~name, ~life_expectancy, ~poverty, ~household_income,
"New Haven", 78.0580437642378, 0.264221051111753, 42588.7592521085,
"New York", 12.349685329, 0.324067934, 32156.230974623
)
rounded <- df %>%
  select(life_expectancy, poverty, household_income) %>%
  map2_dfc(
    .y = c(1, 2, 0),
    .f = ~ round(.x, digits = .y)
  )
df %>%
select(-life_expectancy, -poverty, -household_income) %>%
bind_cols(rounded)
#> # A tibble: 2 x 4
#> name life_expectancy poverty household_income
#> <chr> <dbl> <dbl> <dbl>
#> 1 New Haven 78.1 0.26 42589
#> 2 New York 12.3 0.32 32156
modify2_at <- function(.x, .y, .at, .f) {
  modified <- .x[.at] %>%
    map2(.y, .f)
  .x[.at] <- modified
  return(.x)
}
df %>%
modify2_at(
.y = c(1, 2, 0),
.at = c("life_expectancy", "poverty", "household_income"),
.f = ~ round(.x, digits = .y)
)
#> # A tibble: 2 x 4
#> name life_expectancy poverty household_income
#> <chr> <dbl> <dbl> <dbl>
#> 1 New Haven 78.1 0.26 42589
#> 2 New York 12.3 0.32 32156
Created on 2018-11-13 by the reprex package (v0.2.1)
Fun with tidyeval:
prepared_pairs <-
  map2(
    set_names(syms(list("life_expectancy", "poverty", "household_income"))),
    c(1, 2, 0),
    ~ expr(round(!!.x, digits = !!.y))
  )
mutate(df, !!! prepared_pairs)
# # A tibble: 1 x 4
# name life_expectancy poverty household_income
# <chr> <dbl> <dbl> <dbl>
# 1 New Haven 78.1 0.26 42589
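For readers on dplyr 1.0.0 or later, where mutate_at() is superseded by across(), the same name-based lookup can probably be written with cur_column(), which returns the name of the column currently being processed. This is only a sketch under that version assumption, not something taken from the answers above; it reuses the named digits vector from the first answer, and result is a name introduced for illustration.

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~name, ~life_expectancy, ~poverty, ~household_income,
  "New Haven", 78.0580437642378, 0.264221051111753, 42588.7592521085
)
digits <- c(life_expectancy = 1, poverty = 2, household_income = 0)

# For each selected column, look up its own digits value by column name
result <- df %>%
  mutate(across(all_of(names(digits)), ~ round(.x, digits[[cur_column()]])))

result
```

Because the lookup is by name, the order of the entries in digits does not need to match the column order in the data frame.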