Given the dplyr workflow:
require(dplyr)
mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter(grepl(x = model, pattern = "Merc")) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
I'm interested in conditionally applying filter depending on the value of applyFilter.
Solution
For applyFilter <- 1 the rows are filtered using the "Merc" string; when applyFilter is 0, no rows are dropped and all rows are returned.
applyFilter <- 1
mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter(model %in%
    if (applyFilter) {
      rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
    } else {
      rownames(mtcars)
    }) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
Problem
The suggested solution is inefficient because the filter call is always evaluated, even when applyFilter is false and the if/else expression simply returns every row name; a more desirable approach would only evaluate the filter step for applyFilter <- 1.
Attempt
The ideal workflow would look something like this:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
# Only apply filter step if condition is met
if (applyFilter) {
filter(grepl(x = model, pattern = "Merc"))
}
%>%
# Continue
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Naturally, the syntax above is incorrect. It's only an illustration of how the ideal workflow should look.
Desired answer
I'm not interested in creating an interim object; the workflow should resemble:
startingObject %>% ... conditional filter ... final object
Ideally, I would like to arrive at a solution where I can control whether the filter call is evaluated or not.
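One pattern that seems to fit this shape, offered as a sketch rather than a definitive answer: with the magrittr pipe, a braced expression on the right-hand side receives the incoming data as the dot placeholder, so an ordinary if/else can either call filter or pass the data through untouched. The wrapper function summariseMPG below is hypothetical and exists only so both branches can be run side by side; it assumes dplyr and tibble are installed.

```r
library(dplyr)

# Hypothetical wrapper (not part of the original workflow), used only
# to show both values of applyFilter without an interim data object.
summariseMPG <- function(applyFilter) {
  mtcars %>%
    tibble::rownames_to_column(var = "model") %>%
    # Inside braces the pipe does not insert the data as the first argument,
    # so `.` is passed explicitly. filter() is only called when applyFilter
    # is true; otherwise the data is returned unchanged.
    { if (applyFilter) filter(., grepl(x = model, pattern = "Merc")) else . } %>%
    group_by(am) %>%
    summarise(meanMPG = mean(mpg))
}

summariseMPG(TRUE)   # filter step runs
summariseMPG(FALSE)  # filter step is skipped
```

Because if only evaluates the branch it selects, the grepl call is never run when applyFilter is false, which is the control over evaluation asked for above.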