I'm trying to write a function that uses ddply inside of it. However, I can't get it to work. This is a dummy example reproducing what I get. Does this have anything to do with this bug?
library(plyr)     # provides ddply and .()
library(ggplot2)  # provides the diamonds data set
data(diamonds)
foo <- function(data, fac1, fac2, bar) {
  res <- ddply(data, .(fac1, fac2), mean(bar))
  res
}
foo(diamonds, "color", "cut", "price")
I don't believe this is a bug. ddply expects a function as its third argument, and mean(bar) is a function call, not a function. You need to write a complete function that calculates the mean you'd like:
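foo <- function(data, fac1, fac2, bar) {
  # x is the subset of data for one group; ind is the column name passed as bar
  res <- ddply(data, c(fac1, fac2), function(x, ind) {
    mean(x[[ind]])  # [[ ]] extracts the column as a plain vector
  }, bar)
  res
}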
Also, you shouldn't pass strings to .(), so I changed that to c(). That way you can pass the function arguments directly to ddply.
There are quite a few things wrong with your code, but the main issue is: you are passing column names as character strings.
Just doing a 'find-and-replace' with your parameters within the function yields:
res <- ddply(diamonds, .("color", "cut"), mean("price"))
If you understand how ddply works (I kind of doubt this, given the rest of the code), you will understand that this is not supposed to work. Ignoring the error in the last part (the function), it should be the following. Notice the lack of quotes: the .() notation is nothing more than plyr's way of providing the quotes:
res <- ddply(diamonds, .(color, cut), mean(price))
Fortunately, ddply also supports passing its second argument as a character vector, i.e. the names of the columns, so (once again disregarding issues with the last parameter) this should become:
foo <- function(data, facs, bar) {
  res <- ddply(data, facs, mean(bar))
  res
}
foo(diamonds, c("color", "cut"), "price")
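To see that the two ways of naming the grouping columns really are interchangeable, here is a minimal sketch on a made-up data frame. The names toy, g1, g2 and val are hypothetical and not from the question, and nrow is used as the per-group function only because it is a ready-made function that takes a data.frame.

```r
library(plyr)

# Hypothetical toy data, not part of the original question.
toy <- data.frame(
  g1  = c("a", "a", "b", "b"),
  g2  = c("x", "y", "x", "x"),
  val = c(1, 2, 3, 5)
)

# Grouping columns given with plyr's .() quoting helper (bare names) ...
by_quoted <- ddply(toy, .(g1, g2), nrow)

# ... and the same grouping given as a character vector of column names.
by_names <- ddply(toy, c("g1", "g2"), nrow)

# Both calls should describe the same three groups with the same row counts.
all.equal(by_quoted, by_names)
```

The character-vector form is the one that works from inside a function, because the column names arrive there as strings.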
Finally: the function you pass to ddply should take a data.frame as its first argument. On each call, that data.frame holds the part of the data.frame you passed along (diamonds) for the current values of color and cut. Neither mean("price") nor mean(price) is such a function. If you insist on using ddply, here's what you need to do:
foo <- function(data, facs, bar) {
  # dfr is the subset for one group; colnm receives the value of bar
  res <- ddply(data, facs, function(dfr, colnm) {mean(dfr[[colnm]])}, bar)
  res
}
foo(diamonds, c("color", "cut"), "price")
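If you want to check the behaviour on something small enough to verify by hand, the following is a rough sketch under a few assumptions: the data frame toy and its columns g1, g2 and val are invented for illustration, and the column is extracted with [[ ]] so that the function also works when the input is a tibble rather than a plain data.frame.

```r
library(plyr)

# Same idea as the function above: facs is a character vector of grouping
# columns, bar is the name of the column to average.
foo <- function(data, facs, bar) {
  ddply(data, facs, function(dfr, colnm) mean(dfr[[colnm]]), bar)
}

# Hypothetical toy data with two groups: (a, x) and (b, y).
toy <- data.frame(
  g1  = c("a", "a", "b", "b"),
  g2  = c("x", "x", "y", "y"),
  val = c(1, 3, 10, 20)
)

res <- foo(toy, c("g1", "g2"), "val")
res  # one row per group; the unnamed result lands in a column called V1
```

The group (a, x) averages 1 and 3, and (b, y) averages 10 and 20, so the V1 column should hold 2 and 15.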