Why does tapply take the subset as NA and not excl

Posted 2020-02-16 04:20

I have a question. I want to make a barplot with means and error bars, grouped by two factors. To get the means and the standard errors I used the function tapply.

However, for one of the factors I want to drop one level.

So what I did was:

dataFE <- data[-which(plant=="FS"),] # this works fine, I get exactly the data set I want without the FS level of the factor plant 

Then to get the mean and standard error I use this:

means <- with(dataFE, as.matrix(tapply(leaves, list(plant, Orchestia), mean), nrow=2))

e <- with(dataFE, as.matrix(tapply (leaves, list(plant, Orchestia), function(x) sd(x)/sqrt(length(x))), nrow=2))

And there something strange happens: it does not calculate anything for FS, but it still puts FS in the table with NA:

    row.names   no          yes
1   F           7.009022    5.307185
2   FS          NA          NA
3   S           2.837139    2.111054

This is not what I want, because if I use this in barplot2 (package gplots) I get an empty bar for FS, whereas that bar should not be there at all.

So does any of you have a solution or another method to get a nice barplot :). Thanks anyway!

Tags: r subset tapply
1 answer
混吃等死
Reply #2 · 2020-02-16 04:43

Without a sample of your data, I'll just wager a guess:

Your column plant is a factor. While you have dropped the rows that have that value, the level "FS" still exists, and tapply returns NA for a level that has no observations. Use levels(dataFE$plant) to see it. You can then use droplevels to get rid of it.

dat <- data.frame(x=1:15, y=factor(letters[1:3]))

> levels(dat$y)
[1] "a" "b" "c"

dat <- dat[dat$y != 'a',]
> levels(dat$y)
[1] "a" "b" "c"
> 

> tapply(dat$x, dat$y, sum)
 a  b  c 
NA 40 45 
> 

> droplevels(dat$y)
 [1] b c b c b c b c b c
Levels: b c
> dat$y <- droplevels(dat$y)

> tapply(dat$x, dat$y, sum)
 b  c 
40 45 
> 
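
Applied to the situation in the question, the same idea might look like the sketch below. The data frame here is a made-up stand-in (the real data was not posted), so the values are purely illustrative; only the column names plant, Orchestia and leaves are taken from the question, and the helper se is a hypothetical name introduced for the example.

```r
# Hypothetical stand-in for the asker's data; values are invented for illustration.
set.seed(1)
data <- data.frame(
  plant     = factor(rep(c("F", "FS", "S"), each = 4)),
  Orchestia = factor(rep(c("no", "yes"), times = 6)),
  leaves    = round(runif(12, 1, 10), 1)
)

# Subset with a logical condition, then drop the now-unused "FS" level.
# droplevels() on a data frame drops unused levels from all its factor columns.
dataFE <- droplevels(data[data$plant != "FS", ])
levels(dataFE$plant)  # "F" "S"

# Standard error of the mean (helper name `se` is an assumption).
se <- function(x) sd(x) / sqrt(length(x))

# With two grouping factors, tapply() already returns a matrix.
means <- with(dataFE, tapply(leaves, list(plant, Orchestia), mean))
e     <- with(dataFE, tapply(leaves, list(plant, Orchestia), se))

rownames(means)  # "F" "S"
```

With the unused level gone, means and e should have only the F and S rows, so a grouped barplot built from them would have no empty FS bar. The logical subset is used here instead of -which(...) because -which() selects no rows at all when nothing matches; the as.matrix(..., nrow=2) wrapper from the question is left out since tapply with two factors already gives a matrix.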