Here's some code that generates a list of data.frame
s and then converts that original list into a new list with each list element a list of the rows of each data frame.
Eg.
- l1
has length 10 and each element is a data.frame
with 1000 rows.
- l2
is a list of length 1000 (nrow(l1[[k]])
) and each element is a list
of length 10 (length(l1)
) containing row-vectors from the elements of l1
l1 <- vector("list", length= 10)
set.seed(65L)
for (i in 1:10) {
l1[[i]] <- data.frame(matrix(rnorm(10000),ncol=10))
}
l2 <- vector(mode="list", length= nrow(l1[[1]]))
for (i in 1:nrow(l1[[1]])) {
l2[[i]] <- lapply(l1, function(l) return(unlist(l[i,])))
}
Edit To clarify how l1
relates to l2
, here is language agnostic code.
for (j in 1:length(l1) {
for (i in 1:nrow(l1[[1]]) { # where nrow(l1[[1]]) == nrow(l1[[k]]) k= 2,...,10
l2[[i]][[j]] <- l1[[j]][i,]
}
}
How do I speed the creation of l2
up via vectorization or parallelization? The problem I'm having is that parallel::parLapplyLB
splits lists; however, I don't want to split the list l1
, what I want to do is split the rows within each element of l1
. An intermediate solution would vectorize my current approach by using some *apply
function to replace the for-loop. This could obviously be extended to a parallel solution as well.
If I solve this on my own before an acceptable solution, I'll post my answer here.