I am attempting to row bind many data frames together into a single massive data frame. The data frames are named sequentially with the first named df1, the second named df2, the third named df3, etc. Currently, I have bound these data frames together by explicitly typing the names of the data frames; however, for a very large number of data frames (roughly 10,000 total data frames are expected) this is suboptimal.
Here is a working example:
# Load required packages
library(plyr)
# Generate 100 example data frames
for (i in 1:100) {
  assign(paste0('df', i), data.frame(x = 1:100,
                                     y = seq(from = 1,
                                             to = 1000,
                                             length.out = 100)))
}
# Create a master merged data frame
df <- rbind.fill(df1, df2, df3, df4, df5, df6, df7, df8, df9, df10,
df11, df12, df13, df14, df15, df16, df17, df18, df19, df20,
df21, df22, df23, df24, df25, df26, df27, df28, df29, df30,
df31, df32, df33, df34, df35, df36, df37, df38, df39, df40,
df41, df42, df43, df44, df45, df46, df47, df48, df49, df50,
df51, df52, df53, df54, df55, df56, df57, df58, df59, df60,
df61, df62, df63, df64, df65, df66, df67, df68, df69, df70,
df71, df72, df73, df74, df75, df76, df77, df78, df79, df80,
df81, df82, df83, df84, df85, df86, df87, df88, df89, df90,
df91, df92, df93, df94, df95, df96, df97, df98, df99, df100)
Any thoughts on how to optimize this would be greatly appreciated.
We can use bind_rows from dplyr.
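A minimal sketch of the bind_rows approach, assuming the data frames live in the global environment under the names df1, df2, ... as in the question. Using mget() to collect them by name is my assumption about how to build the list; only 3 frames are generated here to keep the example short.

```r
library(dplyr)

# Recreate a few example data frames (3 instead of 100, for brevity)
for (i in 1:3) {
  assign(paste0("df", i),
         data.frame(x = 1:100,
                    y = seq(from = 1, to = 1000, length.out = 100)))
}

# mget() looks up objects by name and returns them as a named list
df_list <- mget(paste0("df", 1:3))

# bind_rows() accepts a list of data frames and fills missing columns with NA
df <- bind_rows(df_list)
```

For the real case, replacing 1:3 with 1:10000 should be all that changes, provided every name in the sequence exists.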
do.call comes in handy. The function you specify works on a list of arguments.
Or use data.table::rbindlist. Set fill to TRUE to take care of the missing values, if any.
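The last two suggestions could look roughly like this. It is a sketch under the same assumptions as the question (objects named df1, df2, ... in the global environment); collecting them with mget() and the result names df_plyr and df_dt are illustrative choices of mine, not something the answers specify.

```r
library(plyr)
library(data.table)

# Recreate a few example data frames (3 instead of 100, for brevity)
for (i in 1:3) {
  assign(paste0("df", i),
         data.frame(x = 1:100,
                    y = seq(from = 1, to = 1000, length.out = 100)))
}

# Collect the data frames into a named list by looking up their names
df_list <- mget(paste0("df", 1:3))

# do.call() calls rbind.fill with each list element as a separate argument,
# i.e. the same as rbind.fill(df1, df2, df3)
df_plyr <- do.call(rbind.fill, df_list)

# rbindlist() takes the list directly; fill = TRUE pads missing columns with NA
df_dt <- rbindlist(df_list, fill = TRUE)
```

Note that rbindlist() returns a data.table rather than a plain data frame. If the data frames are generated in your own code, it may be simpler to build them in a list from the start (for example with lapply) instead of using assign() and then mget().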