I have a data frame which I then split into three (or any number of) data frames.
What I’m trying to do is automatically process each column in each data frame and add lagged versions of the existing variables.
For example, if there were three variables in each data frame (v1, v2, v3), I would like to automatically (without hardcoding) add v1.lag, v2.lag and v3.lag.
Here is what I have so far, but I’m stuck now.
Any help would be highly appreciated.
dd<-data.frame(matrix(rnorm(216),72,3),c(rep("A",24),rep("B",24),rep("C",24)),c(rep("J",36),rep("K",36)));
colnames(dd) <- c("v1", "v2", "v3", "dim1", "dim2");
dd;
dds <- split(dd, dd$dim1);
dds;
# Missing step 1: Automatically create v1.lag, v2.lag, v3.lag, etc (if required)
Finally, I would like to merge the three data frames into one big data frame that includes the newly created variables.
# Missing step 2: Merge data frames into single data frame
Any help would be highly appreciated.
EDIT: In the comments section I asked about moving averages instead of lags. Here is the solution:
ma <- function(x, f=c(1,1,1)){as.numeric(filter(x, f, sides=1)/length(f));}
foo <- function(df, f = c(1,1,1)) {
    nums <- sapply(df, is.numeric); ## which are numeric vars
    nams <- paste(names(df)[nums], "ma", length(f), sep = "."); ## generate new names foo.ma
    df[, nams] <- lapply(which(nums), function(id, df, f) ma(df[[id]], f = f), df = df, f = f); ## apply ma to each numeric variable
    df; ## return
}
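To see what the `ma()` helper does, here is a small worked example on a toy vector (the vector `x` is made up for illustration). It repeats the definition, with `filter()` written as `stats::filter()` on the assumption that a package such as dplyr might be loaded and mask it. With the default weights `c(1,1,1)` and `sides=1`, each value is the mean of the current and the two previous observations, so the first two results are `NA`.

```r
# Trailing moving average; stats:: prefix guards against filter() being masked.
ma <- function(x, f = c(1, 1, 1)) {
  as.numeric(stats::filter(x, f, sides = 1) / length(f))
}

x <- c(1, 2, 3, 4, 5)  # hypothetical toy input
ma(x)
# [1] NA NA  2  3  4
```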
Use the plyr package to do all of this in one step:

Here is one option:
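As a minimal sketch of the lag step in base R, assuming a lag of one row within each split data frame: the helpers `lag_vec()` and `add_lags()` are names invented here, and the `set.seed()` call is only there so the example is reproducible. It mirrors the structure of the moving-average function from the question's edit, but is not necessarily the code originally posted.

```r
set.seed(1)  # only for reproducibility of this sketch
dd <- data.frame(matrix(rnorm(216), 72, 3),
                 dim1 = rep(c("A", "B", "C"), each = 24),
                 dim2 = rep(c("J", "K"), each = 36))
colnames(dd)[1:3] <- c("v1", "v2", "v3")
dds <- split(dd, dd$dim1)

# Hypothetical helper: shift a vector down by n rows, padding the top with NA.
lag_vec <- function(x, n = 1) c(rep(NA, n), x)[seq_along(x)]

# Hypothetical helper: add a <name>.lag column for every numeric column.
add_lags <- function(df, n = 1) {
  nums <- sapply(df, is.numeric)                    # which columns are numeric
  nams <- paste(names(df)[nums], "lag", sep = ".")  # v1.lag, v2.lag, ...
  df[nams] <- lapply(df[nums], lag_vec, n = n)      # lagged copies
  df
}

dds_lag <- lapply(dds, add_lags)  # apply to each data frame in the list
names(dds_lag$A)
```

Because the lag is computed inside each piece, the first row of every group gets `NA` rather than a value carried over from the previous group. If plyr is loaded, `ddply(dd, "dim1", add_lags)` should do the split, apply and combine in a single call.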
Which gives:
For the last bit, the combine step, save the above:

then use do.call() to rbind() the individual data frames together, as in:

which gives:
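A minimal sketch of the combine step, using a small made-up list in place of the saved result (the name `dds2` and its contents are assumptions for illustration only):

```r
# Hypothetical stand-in for the saved list of per-group data frames.
toy  <- data.frame(v1 = 1:6, dim1 = rep(c("A", "B", "C"), each = 2))
dds2 <- split(toy, toy$dim1)

# rbind() all list elements in a single call.
combined <- do.call(rbind, dds2)
rownames(combined) <- NULL  # optional: drop the "A.1", "A.2", ... row names
combined
```

`do.call(rbind, dds2)` is equivalent to writing `rbind(dds2$A, dds2$B, dds2$C)` by hand, but works for any number of data frames in the list.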