I am trying to combine the third column of several data frames, which are called and renamed in a nested for loop, within the same looping process.
# Sample Data
ecvec_msa6_1998=matrix( round(rnorm(200, 5,15)), ncol=4)
ecvec_msa6_1999=matrix( round(rnorm(200, 4,16)), ncol=4)
ecvec_msa6_2000=matrix( round(rnorm(200, 3,17)), ncol=4)
datasets=c("msa")
num_industrys=c(6)
years=c(1998, 1999, 2000)
alist=list()
for (d in 1:length(datasets)) {
  dataset=datasets[d]
  for (n in 1:length(num_industrys)){
    num_industry=num_industrys[n]
    for (y in 1:length(years)) {
      year=years[y]
      eval(parse(text=paste0("newly_added = ecvec_", dataset, num_industry, "_", year)))
      # renaming the old data frames
      alist = list(alist, newly_added) # combining them in a list
      extracted_cols <- lapply(alist, function(x) x[3]) # selecting the third column
      result <- do.call("cbind", extracted_cols) # trying to cbind the third column
    }
  }
}
Can somebody show me the right way to do this?
Your code almost works - here are a few changes...
alist=list()
for (d in 1:length(datasets)) {
  dataset=datasets[d]
  for (n in 1:length(num_industrys)){
    num_industry=num_industrys[n]
    for (y in 1:length(years)) {
      year=years[y]
      eval(parse(text=paste0("newly_added = ecvec_", dataset, num_industry, "_", year)))
      #the next line produces the sort of list you want - yours was too nested
      alist = c(alist, list(newly_added))
    }
  }
}
#once you have your list, these commands should be outside the loop
extracted_cols <- lapply(alist, function(x) x[,3]) #note the added comma!
result <- do.call(cbind, extracted_cols) #no quotes needed around cbind
head(result)
     [,1] [,2] [,3]
[1,]   11   13   24
[2,]  -26   -3    7
[3,]   -1  -26  -14
[4,]    5   14  -15
[5,]   28    3    8
[6,]    9   -9   19
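To see why the comma matters, here is a minimal sketch on a small made-up matrix (`m` and `df` are illustrative names, not part of the original data). On a matrix, a single index counts elements in column-major order, whereas `[, 3]` selects the whole third column. On a data frame, a single index selects a column instead, which is probably where the `x[3]` habit comes from.

```r
# A small 3 x 4 matrix, filled column by column: columns are 1:3, 4:6, 7:9, 10:12
m <- matrix(1:12, ncol = 4)

third_element <- m[3]    # single index on a matrix: the 3rd element overall
third_column  <- m[, 3]  # empty row index, column index 3: the whole 3rd column

# On a data frame, a single index picks a column and keeps it as a data frame
df <- as.data.frame(m)
df_third <- df[3]

print(third_element)
print(third_column)
print(dim(df_third))
```

Since the sample objects in the question are built with `matrix()`, `x[3]` returns one number per object rather than a column.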
HOWEVER - a much more R-like (and faster) way of doing this would be to replace all of the above with
df <- expand.grid(datasets,num_industrys,years) #generate all combinations
datanames <- paste0("ecvec_",df$Var1,df$Var2,"_",df$Var3) #paste them into a vector of names
result <- sapply(datanames,function(x) get(x)[,3])
sapply automatically simplifies the result into a matrix (or vector) if it can; lapply always produces a list.
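A short sketch of the difference, using two stand-in matrices built the same way as the sample data (the seed and the intermediate names `as_list` and `as_matrix` are just for illustration):

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
ecvec_msa6_1998 <- matrix(round(rnorm(200, 5, 15)), ncol = 4)
ecvec_msa6_1999 <- matrix(round(rnorm(200, 4, 16)), ncol = 4)

datanames <- c("ecvec_msa6_1998", "ecvec_msa6_1999")

# lapply: always a list, one element (a length-50 vector) per name
as_list <- lapply(datanames, function(x) get(x)[, 3])

# sapply: the equal-length vectors are simplified into a 50 x 2 matrix,
# and the column names are taken from the character vector
as_matrix <- sapply(datanames, function(x) get(x)[, 3])

print(class(as_list))
print(dim(as_matrix))
print(colnames(as_matrix))
```

If you need a data frame rather than a matrix, wrapping the result in `as.data.frame()` should do it.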
It's often recommended to avoid nested loops in R:
See Circle 2 of The R Inferno or here.
Maybe you should try to replace this part
extracted_cols <- lapply(alist, function(x) x[3]) # selecting the third column
result <- do.call("cbind", extracted_cols) # trying to cbind the third column
with a list, as Patrick Burns does in the first link (p. 14). That could also be much cleaner.
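As a rough sketch of what that might look like, assuming the advice meant is the usual one from Circle 2 (allocate the container once instead of growing it inside the loop). The stand-in data `mats` and the name `third_cols` are made up for illustration and are not taken from Burns' text:

```r
set.seed(42)  # arbitrary seed for reproducibility
years <- c(1998, 1999, 2000)

# Stand-in data: one 50 x 4 matrix per year, kept together in a list
mats <- lapply(years, function(y) matrix(round(rnorm(200, 5, 15)), ncol = 4))

# Allocate the result list once, at its final length
third_cols <- vector("list", length(years))

for (i in seq_along(years)) {
  third_cols[[i]] <- mats[[i]][, 3]  # fill slot i; nothing is grown or copied
}

# Bind the columns once, after the loop
result <- do.call(cbind, third_cols)
colnames(result) <- years

print(dim(result))
```

With only three small matrices the speed difference is negligible; the main gain here is that the combining step happens once, outside the loop.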
Are you simply looking to extract and combine the third columns from each dataframe into a new one?
newdata <- cbind(ecvec_msa6_1998[,3],ecvec_msa6_1999[,3],ecvec_msa6_2000[,3])