How to assign within apply family?

Posted 2019-06-23 02:01

I have a data.frame that contains several factors, and I want to rename the factor levels for all of these factors. E.g.:

mydf <- data.frame(col1 = as.factor(c("A","A",NA,NA)),col2 = as.factor(c("A",NA,NA,"A")))
mydf <- as.data.frame(lapply(mydf,addNA))

Note that the real-life example has far more than two columns. Hence I would like to use apply to assign new level names to all of these columns, just as in:

levels(mydf$col1) <- c("1","0") 

I tried the following but it did not work…

 apply(mydf,1,function(x) levels(x) <- c("1","0"))

I am not really surprised it doesn't work but I have no better ideas right now. Should I use with maybe?

EDIT: I realized I made a mistake here by oversimplifying things. I used addNA because the NAs should no longer be handled as NAs; they should be a regular level, so I also want to relabel them. This doesn't work with Andrie's suggestion and returns the following error message:

 labels = c("1",  : invalid labels; length 2 should be 1 or 1  

Note that I updated my example df.

1 Answer
smile是对你的礼貌
#2 · 2019-06-23 02:34

You can change levels by reference using setattr() from either the bit package or the data.table package. This avoids copying the whole dataset, which matters since you said you have a lot of columns.

require(bit)          # Either package
require(data.table)   #
setattr(mydf[[1]],"levels",c("1","0"))
setattr(mydf[[2]],"levels",c("1","0"))

That can be done in a simple for loop over the columns, which is very fast. It is your responsibility to ensure that you replace the levels vector with a vector of the same length; otherwise the factor may no longer be valid. Also, with this method you have to replace the whole levels vector at once. There is an internal way in data.table to replace particular level names by reference, but there is probably no need to go that far.
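The for loop mentioned above might look roughly like the sketch below. It assumes data.table is installed (bit::setattr() should be usable the same way) and rebuilds the example data from the question. The name new_levels and the length check are illustrative additions, not part of the original suggestion.

```r
library(data.table)  # provides setattr(); the bit package is the alternative named above

# Example data from the question: two factor columns, with NA promoted to a level
mydf <- data.frame(col1 = as.factor(c("A", "A", NA, NA)),
                   col2 = as.factor(c("A", NA, NA, "A")))
mydf <- as.data.frame(lapply(mydf, addNA))

# Hypothetical name: the replacement levels, in the order of the existing levels
new_levels <- c("1", "0")

for (j in seq_along(mydf)) {
  # Illustrative guard: only touch factor columns whose level count matches,
  # because setattr() itself does not validate the resulting factor
  if (is.factor(mydf[[j]]) && nlevels(mydf[[j]]) == length(new_levels)) {
    setattr(mydf[[j]], "levels", new_levels)  # modifies the column in place
  }
}

str(mydf)
```

Because addNA() places NA last among the levels, "A" maps to "1" and the former NA maps to "0" in this example. Check levels() on your own columns first if the order matters.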
