-->

Dplyr Multiple Lags Tidy Eval?

Posted 2019-07-08 19:44

Question:

I am trying to make multiple lags using the least amount of code possible in dplyr, while sticking to tidy eval. The following Standard Evaluation (SE) code works:

#if(!require(dplyr)) install.packages("dplyr");
library(dplyr)

a=as_tibble(c(1:100))

lags=3

lag_prefix=paste0("L", 1:lags, ".y") 

multi_lag=setNames(paste("lag(.,", 1:lags, ")"), lag_prefix)

a %>% mutate_at(vars(value), funs_(multi_lag)) #final line

# A tibble: 100 x 4
   value  L1.y  L2.y  L3.y
   <int> <int> <int> <int>
 1     1    NA    NA    NA
 2     2     1    NA    NA
 3     3     2     1    NA
 4     4     3     2     1
 5     5     4     3     2
 6     6     5     4     3
 7     7     6     5     4
 8     8     7     6     5
 9     9     8     7     6
10    10     9     8     7
# ... with 90 more rows

However, you'll notice that the final line does not use tidy eval but falls back on SE. The package documentation for funs_ says it is superfluous now that tidy eval exists. Is it possible to do this with tidy eval instead? Any help is appreciated; I am new to the different evaluation types.

Answer 1:

From this blog post: multiple lags with tidy evaluation by Romain François

library(rlang)
library(tidyverse)

a <- as_tibble(c(1:100))
n_lags <- 3

lags <- function(var, n = 3) {
  var <- enquo(var)
  indices <- seq_len(n)

  # create a list of quosures by looping over `indices`
  # then give them names for `mutate` to use later
  map(indices, ~ quo(lag(!!var, !!.x))) %>%
    set_names(sprintf("L_%02d.%s", indices, "y"))
}

# unquote the list of quosures so that they are evaluated by `mutate`
a %>% 
  mutate_at(vars(value), funs(!!!lags(value, n_lags)))

#> # A tibble: 100 x 4
#>    value L_01.y L_02.y L_03.y
#>    <int>  <int>  <int>  <int>
#>  1     1     NA     NA     NA
#>  2     2      1     NA     NA
#>  3     3      2      1     NA
#>  4     4      3      2      1
#>  5     5      4      3      2
#>  6     6      5      4      3
#>  7     7      6      5      4
#>  8     8      7      6      5
#>  9     9      8      7      6
#> 10    10      9      8      7
#> # ... with 90 more rows

Created on 2019-02-15 by the reprex package (v0.2.1.9000)
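Note that funs() has been deprecated since dplyr 0.8.0. Because the lags() helper above already returns a named list of quosures, that list can be spliced directly into mutate() with no wrapper. The following is a minimal sketch, assuming current releases of dplyr, purrr and rlang; it builds the input with tibble(value = ...) rather than as_tibble() on a vector, and the name res is only illustrative.

```r
library(dplyr)
library(purrr)
library(rlang)

a <- tibble(value = 1:100)

# same helper as in the answer above: one named quosure per lag
lags <- function(var, n = 3) {
  var <- enquo(var)
  indices <- seq_len(n)
  map(indices, ~ quo(dplyr::lag(!!var, !!.x))) %>%
    set_names(sprintf("L_%02d.%s", indices, "y"))
}

# splice the quosures straight into mutate(); funs() is not needed
res <- a %>% mutate(!!!lags(value, 3))
```

Each spliced quosure becomes one new column, named after the corresponding list element.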



Answer 2:

Inspired by the answer by @Tung, I tried to write a more generic function that looks more like a tidyr function than a dplyr one, i.e. it is called outside mutate.

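# lags function
library(dplyr)
library(purrr)

lags <- function(data, var, nlags) {
  var <- enquo(var)

  data %>% 
    bind_cols(
      map_dfc(seq_len(nlags), 
              function(x) {
                # build the column name, e.g. "L_01.y"
                new_var <- sprintf("L_%02d.%s", x, "y")
                # `!!new_var :=` uses the string as the column name
                data %>% transmute(!!new_var := lag(!!var, x))
              }
      ))
}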

# Apply function to data frame
a <- as_tibble(c(1:100))

a %>% 
  lags(value, 3)
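With dplyr 1.0.0 or later, another option that avoids both funs_() and explicit quosures is to pass a named list of functions to across(). The following is only a sketch under that version assumption; lag_fns and res_across are hypothetical names, and the column names follow the L1.y pattern from the question.

```r
library(dplyr)
library(purrr)

a <- tibble(value = 1:100)
n_lags <- 3

# one closure per lag; force(k) pins the lag distance for each closure
lag_fns <- map(seq_len(n_lags), function(k) {
  force(k)
  function(x) dplyr::lag(x, n = k)
}) %>%
  set_names(paste0("L", seq_len(n_lags), ".y"))

# .names = "{.fn}" names each new column after its list element
res_across <- a %>%
  mutate(across(value, lag_fns, .names = "{.fn}"))
```

Since across() accepts any column selection, the same list can be applied to several columns by changing the first argument and using a pattern such as "{.col}_{.fn}" in .names.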