I tried to aggregate a large dataset with the ffdfdply function from the 'ffbase' package in R.
Let's say I have three variables called Date, Item and sales. I want to aggregate sales over Date and Item using the sum function. Could you please guide me to the proper syntax in R?
Here is what I tried:
grp_qty <- ffdfdply(x = data[c("sales", "Date", "Item")],
                    split = as.character(data$sales),
                    FUN = function(data) summaryBy(Date + Item ~ sales, data = data, FUN = sum))
I would appreciate a solution.
Note that ffdfdply is part of ffbase, not ff. To show an example of the usage of ffdfdply, let's generate an ffdf with 50 million rows.
Note that grp_qty is an ffdf which resides on disk.
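
A minimal sketch of such an aggregation might look like the following. It is an illustration under stated assumptions, not a definitive implementation: the grid sizes n_days and n_items are hypothetical and kept small so the example runs quickly, the split is done on Item (any key that keeps every Date/Item group inside a single chunk should work), and base R's aggregate is used inside FUN because each chunk arrives there as an ordinary in-RAM data.frame.

```r
library(ffbase)  # also attaches ff

# Illustrative sizes (assumption): 10000 days x 5000 items would give roughly 50 million rows.
n_days  <- 100
n_items <- 50

# Build an ffdf holding every Date/Item combination, plus random sales figures.
data <- expand.ffgrid(
  Date = ff(seq.Date(as.Date("2014-01-01"), by = "day", length.out = n_days)),
  Item = ff(factor(paste("Item", seq_len(n_items))))
)
data$sales <- ffrandom(n = nrow(data))

# Split on a grouping key, not on the value being summed.
# Each chunk passed to FUN is a data.frame containing one or more complete Items.
grp_qty <- ffdfdply(
  x     = data[c("sales", "Date", "Item")],
  split = data$Item,
  FUN   = function(chunk) {
    # In-RAM aggregation: sum of sales per Date and Item
    aggregate(sales ~ Date + Item, data = chunk, FUN = sum)
  },
  trace = FALSE
)

dim(grp_qty)
```

In the attempt from the question, the split was made on sales, which is the value being summed rather than a grouping key, and the formula had the response and the grouping variables swapped. Splitting on Item or Date and putting sales on the left-hand side of the formula addresses both.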