I'm working on a function that will manipulate a data frame based on a string. Within the function, I'll build a column name from the string and use it to manipulate the data frame, something like this:
library(dplyr)
orig_df <- data_frame(
  id = 1:3
  , amt = c(100, 200, 300)
  , anyA = c(T,F,T)
  , othercol = c(F,F,T)
)
summarize_my_df_broken <- function(df, my_string) {
  my_column <- quo(paste0("any", my_string))
  df %>%
    filter(!!my_column) %>%
    group_by(othercol) %>%
    summarize(
      n = n()
      , total = sum(amt)
    ) %>%
    # I need the original string as new column which is why I can't
    # pass in just the column name
    mutate(stringid = my_string)
}
summarize_my_df_works <- function(df, my_string) {
  my_column <- quo(paste0("any", my_string))
  df %>%
    group_by(!!my_column, othercol) %>%
    summarize(
      n = n()
      , total = sum(amt)
    ) %>%
    mutate(stringid = my_string)
}
# throws an error:
# Argument 2 filter condition does not evaluate to a logical vector
summarize_my_df_broken(orig_df, "A")
# works just fine
summarize_my_df_works(orig_df, "A")
I understand what the problem is: unquoting the quosure as an argument to filter() in the broken version is not referencing the actual column anyA.
What I don't understand is why it works in summarize() but not in filter(). Why is there a difference?
Right now you are making quosures of strings, not symbol names. That's not how quosures are supposed to be used. There's a big difference between quo("hello") and quo(hello). If you want to make a proper symbol name from a string, you need to use rlang::sym. So a quick fix would be to build my_column with rlang::sym(paste0("any", my_string)) instead of quo(paste0("any", my_string)).
If you look more closely, I think you'll see the group_by/summarize version isn't actually working the way you expect either (you just don't get the same error message). Compare its output with what you get from grouping by anyA directly: the two do not produce the same results. Again, the problem is using a string instead of a symbol.
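To make the difference concrete, here is a minimal sketch of the rlang::sym fix together with the group_by comparison. The names summarize_my_df_fixed, by_string and by_symbol are made up for illustration, and tibble() stands in for the deprecated data_frame(); treat it as one possible way to write this rather than the only one.

```r
library(dplyr)
library(rlang)

# Same data as in the question; tibble() replaces the deprecated data_frame()
orig_df <- tibble(
  id = 1:3,
  amt = c(100, 200, 300),
  anyA = c(TRUE, FALSE, TRUE),
  othercol = c(FALSE, FALSE, TRUE)
)

# Hypothetical fixed version: sym() turns the string into a symbol,
# so !! inserts a reference to the column anyA rather than the text "anyA"
summarize_my_df_fixed <- function(df, my_string) {
  my_column <- sym(paste0("any", my_string))
  df %>%
    filter(!!my_column) %>%
    group_by(othercol) %>%
    summarize(n = n(), total = sum(amt)) %>%
    mutate(stringid = my_string)
}

fixed <- summarize_my_df_fixed(orig_df, "A")

# Grouping by a quosure of a string groups by a constant value ...
by_string <- orig_df %>%
  group_by(!!quo("anyA"), othercol) %>%
  summarize(n = n(), total = sum(amt))

# ... while grouping by the symbol groups by the actual column
by_symbol <- orig_df %>%
  group_by(!!sym("anyA"), othercol) %>%
  summarize(n = n(), total = sum(amt))
```

Printing by_string and by_symbol side by side should show that the string version never splits the rows by anyA at all, which is why it only looked like it was working.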
You don't have any condition for filter() in your 'broken' function; you just specify the column name. Beyond that, I'm not sure whether you can insert quosures into larger expressions. For example, you might try building the condition around the unquoted quosure inside filter(), but I don't think that would work.
Instead, I would suggest using the conditional function filter_at() to target the appropriate column. In that case, you separate the quosure from the filter condition.
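A rough sketch of what the filter_at() approach could look like is below. The function name summarize_my_df_filter_at is made up for illustration, and this version passes the column name to filter_at() as a plain character vector instead of a quosure, which is an assumption about how you would want to wire it up. Note that filter_at() is superseded in recent dplyr releases, though it is still available.

```r
library(dplyr)

# Same data as in the question; tibble() replaces the deprecated data_frame()
orig_df <- tibble(
  id = 1:3,
  amt = c(100, 200, 300),
  anyA = c(TRUE, FALSE, TRUE),
  othercol = c(FALSE, FALSE, TRUE)
)

# Hypothetical helper: the column is selected by name, and the condition
# is written separately with all_vars(), where `.` stands for that column
summarize_my_df_filter_at <- function(df, my_string) {
  my_column <- paste0("any", my_string)
  df %>%
    filter_at(my_column, all_vars(. == TRUE)) %>%
    group_by(othercol) %>%
    summarize(n = n(), total = sum(amt)) %>%
    mutate(stringid = my_string)
}

result <- summarize_my_df_filter_at(orig_df, "A")
```

Because the column is picked by its name, no quosure or symbol is needed here, and the original string is still available for the stringid column.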