How to use acast (reshape2) within a function in R

Posted 2019-03-20 04:49

Question:

I tried to use acast from reshape2 inside a function I wrote, but acast could not find the data I passed to it.

Here is my data:

library("reshape2")
x <- data.frame(1:3, rnorm(3), rnorm(3), rnorm(3))    
colnames(x) <- c("id", "var1", "var2", "var3")
y <- melt(x, id.vars = "id", measure.vars = c("var1", "var2", "var3"))

y then looks like this:

  id variable      value
1  1     var1  0.1560812
2  2     var1  1.0343844
3  3     var1 -1.4157728
4  1     var2  0.8808935
5  2     var2  0.1719239
6  3     var2  0.6723758
7  1     var3 -0.7589631
8  2     var3  1.1325995
9  3     var3 -1.5744876

Now I can cast it back with acast:

> acast(y,y[,1] ~ y[,2])
        var1      var2       var3
1  0.1560812 0.8808935 -0.7589631
2  1.0343844 0.1719239  1.1325995
3 -1.4157728 0.6723758 -1.5744876

However, when I write a small wrapper around acast that should do the same thing, I get an error message:

wrap.acast <- function(dat, v1 = 1, v2 = 2) {
    out <- acast(dat, dat[,v1] ~ dat[,v2])
    return(out)
}

wrap.acast(y)

Error in eval(expr, envir, enclos) : object 'dat' not found

The problem is evidently related to environments and global versus local variables: after I define dat in the global environment, I get further errors (v1 and v2 are not found unless they are global too).

I would like to use reshape2 (especially acast) inside a function without having to define the variables outside the function. What is the trick?

Thanks.

Answer 1:

Instead of using the formula specification, use the character specification:

acast(y, list(names(y)[1], names(y)[2]))
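Applied to the wrapper from the question, the character specification could look like the sketch below. The function name wrap_acast_chr and the sample values are illustrative only, and the code assumes the value column is called "value", as melt produces by default.

```r
library(reshape2)

# Illustrative data with fixed values so the result is reproducible
x <- data.frame(id = 1:3,
                var1 = c(0.1, 1.0, -1.4),
                var2 = c(0.8, 0.2, 0.7),
                var3 = c(-0.7, 1.1, -1.5))
y <- melt(x, id.vars = "id", measure.vars = c("var1", "var2", "var3"))

# Hypothetical wrapper: v1 and v2 are column positions in dat.
# The column names are passed as character strings, so no formula
# has to be evaluated in any environment.
wrap_acast_chr <- function(dat, v1 = 1, v2 = 2) {
  acast(dat, list(names(dat)[v1], names(dat)[v2]), value.var = "value")
}

res_chr <- wrap_acast_chr(y)
```

Because only strings are handed to acast, nothing needs to exist in the global environment for the call to succeed.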


Answer 2:

One issue is that you are abusing the formula notation in R. You shouldn't do things like

> acast(y, y[,1] ~ y[,2])
        var1       var2         var3
1  2.1726117  0.6107264  0.291446236
2  0.4755095 -0.9340976 -0.443291873
3 -0.7099464 -1.2536334  0.001105352

as the 'y' bits are redundant if a data object is supplied. If you refer to the variables of y by name directly in the formula, things work nicely

> acast(y, id ~ variable)
        var1       var2         var3
1  2.1726117  0.6107264  0.291446236
2  0.4755095 -0.9340976 -0.443291873
3 -0.7099464 -1.2536334  0.001105352

and the code is much more readable in this second version.

Making the acast wrapper do what you want involves generating the correct formula from the column names, as Joris points out; Hadley's solution is much simpler. So my point really is to be careful with how you use the formula specification in R. You'll save yourself a lot of trouble in the long run (though not specifically with this particular problem) if you use formulas properly.



Answer 3:

Correction: the problem is not that acast can't find dat as such, but that it can't evaluate dat[,v1] and dat[,v2] in the formula you specified. acast takes an argument of type formula, and its terms are evaluated in a temporary environment created around your data frame. Within that environment, there is no "dat" object when the call is wrapped inside another function.

I don't completely follow why this works in the global environment but not when wrapped; however, if you build a formula from the column names and pass that to acast, it works inside a function as well.

wrap.acast <- function(dat, v1 = 1, v2 = 2) {
    x1 <- names(dat)[v1]
    x2 <- names(dat)[v2]
    form <- as.formula(paste(x1,"~",x2))
    out <- acast(dat,form)
    return(out)
}

Using your toy data:

> wrap.acast(y)
        var1      var2       var3
1 0.04095337 0.4044572 -0.4532233
2 1.23905358 1.2493187  0.7083557
3 0.72798307 0.7868746  1.7144811
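The same formula can also be assembled with base R's reformulate() instead of paste() and as.formula(). This is a minimal sketch, assuming the column names are syntactically valid R names; the function name wrap_acast_formula and the sample values are illustrative only.

```r
library(reshape2)

# Illustrative data with fixed values so the result is reproducible
x <- data.frame(id = 1:3,
                var1 = c(0.1, 1.0, -1.4),
                var2 = c(0.8, 0.2, 0.7),
                var3 = c(-0.7, 1.1, -1.5))
y <- melt(x, id.vars = "id", measure.vars = c("var1", "var2", "var3"))

# Hypothetical wrapper: v1 and v2 are column positions in dat.
wrap_acast_formula <- function(dat, v1 = 1, v2 = 2) {
  # Builds the formula `id ~ variable` from the two column names
  form <- reformulate(names(dat)[v2], response = names(dat)[v1])
  # Name the value column explicitly rather than letting acast guess it
  acast(dat, form, value.var = "value")
}

res_formula <- wrap_acast_formula(y)
```

The formula refers to the columns by name only, so it does not depend on any object called dat, v1 or v2 being visible where acast evaluates it.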


Answer 4:

I found a pretty inelegant way to solve the problem using superassignment (<<-).
Changing the function to the following does the job, but it is pretty ugly because it creates global variables that persist after the call.

wrap.acast <- function(dat, v1 = 1, v2 = 2) {
    dat <<- dat
    v1 <<- v1
    v2 <<- v2
    out <- acast(dat, dat[,v1] ~ dat[,v2])
    return(out)
}

I am still very interested in other solutions that clutter the workspace less.

Prior to running the function:

> ls()
[1] "wrap.acast" "x"          "y"     

After running the function:

> ls()
[1] "dat"        "v1"         "v2"         "wrap.acast" "x"         
[6] "y"