I want to add a column of means, based on a factor column, to an R data.frame. Like this:
df1 <- data.frame(X = rep(x = LETTERS[1:2], each = 3), Y = 1:6)
df2 <- aggregate(data = df1, Y ~ X, FUN = mean)
df3 <- merge(x = df1, y = df2, by = "X", suffixes = c(".Old",".New"))
df3
#   X Y.Old Y.New
# 1 A     1     2
# 2 A     2     2
# 3 A     3     2
# 4 B     4     5
# 5 B     5     5
# 6 B     6     5
To accomplish this I have to create two unnecessary data.frames. I'd like to know a way to append a column of means by factor column to my original data.frame without creating any extra data.frames. Thanks for your time and help.
ddply and transform to the rescue (although I'm sure you'll get at least 4 different ways to do this):

This is what the ave function is for.

Joran answered beautifully. This is not an answer to your question but an extension of the conversation. If you're looking for a table of means showing how two categorical variables relate to a dependent variable, here's the Hadley function for that:
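Returning to the original question for a moment: the ave approach named above could look roughly like the sketch below. It uses only base R and the df1 from the question. It is an illustration rather than the answerer's exact code, and the column name Y.New is simply borrowed from the question's expected output.

```r
# Data from the question
df1 <- data.frame(X = rep(x = LETTERS[1:2], each = 3), Y = 1:6)

# ave() returns a vector as long as Y in which every element is replaced
# by the summary of its group; FUN must be passed by name because the
# unnamed arguments after the first are treated as grouping variables.
df1$Y.New <- ave(df1$Y, df1$X, FUN = mean)

df1
```

No intermediate data.frame is created; the new column is attached to df1 directly. With the plyr package installed, the ddply and transform idiom mentioned earlier is usually written as ddply(df1, .(X), transform, Y.New = mean(Y)).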
Here's a head view of CO2 data, and a look at the means table:
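As a rough base-R stand-in (not the function this answer refers to), a two-way table of means on CO2 can be sketched with tapply. Treating uptake as the dependent variable and Type and Treatment as the two categorical variables is an assumption made for illustration.

```r
# CO2 ships with R in the datasets package
head(CO2)

# Mean uptake for every Type x Treatment combination,
# returned as a matrix with Type in rows and Treatment in columns
means_table <- with(CO2, tapply(uptake, list(Type, Treatment), mean))

means_table
```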
Two alternative ways of doing this:
1. with the dplyr package:
2. with the data.table package:
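The two approaches listed above can be sketched as follows. This assumes the dplyr and data.table packages are installed and reuses df1 from the question; the object names res_dplyr and dt1 and the column name Y.New are illustrative choices, not taken from the original answer.

```r
library(dplyr)
library(data.table)

# Data from the question
df1 <- data.frame(X = rep(x = LETTERS[1:2], each = 3), Y = 1:6)

# 1. dplyr: group by X, then mutate() adds the group mean to every row
res_dplyr <- df1 %>%
  group_by(X) %>%
  mutate(Y.New = mean(Y)) %>%
  ungroup()

# 2. data.table: := adds the column by reference, computed per group
dt1 <- as.data.table(df1)
dt1[, Y.New := mean(Y), by = X]

res_dplyr
dt1
```

The data.table version modifies dt1 in place, which is closest to the "no extra data.frames" requirement once the data is stored as a data.table.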
both give the following result: