I've been having this strange problem with apply lately. Consider the following example:
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))
head(df)
  speed dist foo
1     4    2   E
2     4   10   E
3     7    4   B
4     7   22   E
5     8   16   D
6     9   10   C
I want to use apply to apply a function fun (say, mean) on each column of that data.frame. If the data.frame contains only numeric values, I do not have any problem:
apply(cars, 2, mean)
speed  dist
15.40 42.98
But when I try it with my data.frame containing numeric and character data, it seems to fail:
apply(df, 2, mean)
speed  dist   foo
   NA    NA    NA
Warning messages:
1: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
Of course, I was expecting to get NA for the character column, but I would like to get values for the numeric columns anyway.
sapply(df, class)
    speed      dist       foo
"numeric" "numeric"  "factor"
Any pointers would be appreciated, as I feel like I'm missing something very obvious here!
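One thing that might be relevant, as a minimal sketch: as far as I understand the documentation, apply() converts a data.frame to a matrix with as.matrix() before looping over it, and a matrix can only hold a single type. The names m and col_means below are just placeholders of mine, and the two workarounds are possibilities to consider, not necessarily the recommended way:

```r
# Same construction as in the example above
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# Assumption: this mirrors what apply() does internally to a data.frame.
# With one non-numeric column, the whole matrix ends up as character.
m <- as.matrix(df)
typeof(m)

# Possible workaround 1: iterate over the columns as a list, so each
# column keeps its own type, and only average the numeric ones.
col_means <- sapply(df, function(x) if (is.numeric(x)) mean(x) else NA)
col_means

# Possible workaround 2: drop the non-numeric columns first.
colMeans(df[sapply(df, is.numeric)])
```

If the conversion really is the cause, the numeric columns are already strings by the time mean sees them, which would explain why all three results are NA and not only the one for foo.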
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base