My df looks like this:
Id Task Type Freq
3 1 A 2
3 1 B 3
3 2 A 3
3 2 B 0
4 1 A 3
4 1 B 3
4 2 A 1
4 2 B 3
I want to restructure by Id and get:
Id A B … Z
3 5 3
4 4 6
I tried:
df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")
and got the error:
Aggregation function missing: defaulting to length
I can't figure out what to put in fun.aggregate. What's the problem?
The reason you are getting this warning is given in the description of fun.aggregate (see ?dcast). In short, an aggregation function is needed when there is more than one value for one spot in the wide data frame.
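One way to check whether a given formula will need aggregation is to count how many rows fall into each cell of the wide table. The following is a minimal sketch in base R, assuming the data from the question is rebuilt by hand as shown; the variable names cells_by_id and cells_by_id_task are only illustrative.

```r
# Data rebuilt by hand from the table in the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Number of rows per cell for the formula Id ~ Type
cells_by_id <- table(df$Id, df$Type)
print(cells_by_id)

# Number of rows per cell for the formula Id + Task ~ Type
cells_by_id_task <- table(paste(df$Id, df$Task), df$Type)
print(cells_by_id_task)

# Aggregation is needed as soon as any cell holds more than one row
needs_aggregation      <- any(cells_by_id > 1)       # TRUE: two rows per cell
needs_aggregation_task <- any(cells_by_id_task > 1)  # FALSE: one row per cell
```

Any count above 1 means several Freq values compete for the same spot, so a function is needed to reduce them to one.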
An explanation based on your data:
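When you use dcast(df, Id + Task ~ Type, value.var="Freq") you get one row per combination of Id and Task and no warning. That is logical, because for each combination of Id, Task and Type there is only one value in Freq. But when you use dcast(df, Id ~ Type, value.var="Freq") you get the warning message, and every cell holds a count (2) instead of a Freq value. Now, looking back at the top part of your data:
Id Task Type Freq
3 1 A 2
3 1 B 3
3 2 A 3
3 2 B 0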
You can see why this is the case. For each combination of Id and Type there are two values in Freq (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while you can only put one value in this spot in the wide data frame for each value of Type. Therefore dcast wants to aggregate these values into one value. The default aggregation function is length, but you can use other aggregation functions like sum, mean, sd or a custom function by specifying them with fun.aggregate.
For example, with fun.aggregate = sum there is no warning, because dcast is being told what to do when there is more than one value: return the sum of the values. The result then matches the table you asked for (5 and 3 for Id 3, 4 and 6 for Id 4).
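Put together, the two calls might look like the sketch below. It assumes that dcast comes from the reshape2 package (the wording of the message in the question suggests this, but the question does not say so) and that the data is rebuilt by hand from the table in the question; df_counts is a name used here only for illustration.

```r
library(reshape2)  # assumption: dcast() in the question is reshape2::dcast

# Data rebuilt by hand from the table in the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# No aggregation function given: dcast falls back to length,
# so each cell holds the number of rows (2), not a Freq value
df_counts <- dcast(df, Id ~ Type, value.var = "Freq")

# Aggregation function given explicitly: each cell holds the sum of Freq
df_wide <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)
print(df_wide)
#   Id A B
# 1  3 5 3
# 2  4 4 6
```

Swapping sum for mean, sd or a custom function changes only how the competing values in each cell are combined.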