How to correctly use dplyr verbs inside a function

Posted 2019-05-17 08:30

Question:

I want to use filter and summarise from dplyr inside my function. Outside a function, it works as follows:

library(dplyr)
> Orange %>% 
+     filter(Tree==1) %>% 
+     summarise(age_max = max(age))
  age_max
1    1582  

I want to do the same inside a function, but the following fails:

## Function definition:

df.maker <- function(df, plant, Age){

  require(dplyr)

  dfo <- df %>% 
    filter(plant==1) %>% 
    summarise(age_max = max(Age))

  return(dfo)
}

## Use:
> df.maker(Orange, Tree, age)

 Error in as.lazy_dots(list(...)) : object 'Tree' not found

I know that similar questions have been asked before, and I've gone through some relevant links such as page1 and page2, but I can't fully grasp the concepts of non-standard evaluation (NSE) and standard evaluation (SE). I tried the following:

df.maker <- function(df, plant, Age){

  require(dplyr)

  dfo <- df %>% 
    filter_(plant==1) %>% 
    summarise_(age_max = ~max(Age))

  return(dfo)
} 

But I get the same error. Please help me understand what's going on. How can I correctly write my function? Thanks!

EDIT:
I also tried the following:

df.maker <- function(df, plant, Age){

  require(dplyr)

  dfo <- df %>% 
    #filter_(plant==1) %>% 
    summarise_(age_max = lazyeval::interp(~max(x),
                                          x = as.name(Age)))

  return(dfo)
}  

> df.maker(Orange, Tree, age)
 Error in as.name(Age) : object 'age' not found 

Answer 1:

Either pass the column names as character strings and convert them to symbols with as.name:

df.maker1 <- function(d, plant, Age){
  require(dplyr)
  dfo <- d %>% 
    filter_(lazyeval::interp(~x == 1, x = as.name(plant))) %>% 
    summarise_(age_max = lazyeval::interp(~max(x), x = as.name(Age)))
  return(dfo)
}  
df.maker1(Orange, 'Tree', 'age')
  age_max
1    1582
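
For readers on a current dplyr, where filter_ and summarise_ are deprecated, the same character-argument interface can probably be written with the .data pronoun instead of lazyeval::interp. This is only a sketch, assuming dplyr 1.0 or later; the name df_maker_str is made up for illustration and is not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of df.maker1: column names arrive as strings
# and are looked up in the data frame through the .data pronoun.
df_maker_str <- function(d, plant, Age) {
  d %>%
    filter(.data[[plant]] == 1) %>%            # plant is a string such as "Tree"
    summarise(age_max = max(.data[[Age]]))     # Age is a string such as "age"
}

res_str <- df_maker_str(Orange, "Tree", "age")
print(res_str)
```

Called this way it should give the same age_max of 1582 shown above.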

Or pass bare column names and capture them unevaluated with substitute:

df.maker2 <- function(d, plant, Age){
  require(dplyr)
  plant <- substitute(plant)
  Age <- substitute(Age)

  dfo <- d %>% 
    filter_(lazyeval::interp(~x == 1, x = plant)) %>% 
    summarise_(age_max = lazyeval::interp(~max(x), x = Age))
  return(dfo)
}  
df.maker2(Orange, Tree, age)
  age_max
1    1582
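
With dplyr 1.0 or later, the bare-name interface can likely be written more directly with the embracing operator {{ }}, which forwards the caller's column name into the verb without substitute or lazyeval::interp. This is a minimal sketch under that version assumption; df_maker_embrace is a hypothetical name, not something from the original answer.

```r
library(dplyr)

# Hypothetical variant of df.maker2: bare column names are passed through
# to filter() and summarise() with the embracing operator {{ }}.
df_maker_embrace <- function(d, plant, Age) {
  d %>%
    filter({{ plant }} == 1) %>%              # {{ plant }} becomes Tree in the caller's call
    summarise(age_max = max({{ Age }}))       # {{ Age }} becomes age
}

res_embrace <- df_maker_embrace(Orange, Tree, age)
print(res_embrace)
```

The call df_maker_embrace(Orange, Tree, age) should reproduce the age_max of 1582 from the original pipeline.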


Tags: r dplyr nse