I need to execute foreach with two iteration variables of the same size (for example 1:n), but the function changes them in parallel as written here:
We call a and b the iteration variables, since those are the variables that are changing during the multiple executions. Note that we are iterating over them in parallel, that is, they are both changing at the same time.
What I need is to make foreach to change them independently, so that I would have a list with length n^2, not n.
example:
X = foreach(i=1:n, j=1:n) %do% (sum(M[i,]*M[j,]))
in the end I get a vector of length n which is only a diagonal of matrix X, not the full matrix.
P.S. I was trying to make this with for looping, but the computation time was too great to leave the code unoptimized.
To nest foreach
loops, use the nesting operator, "%:%":
library(foreach)
n <- 4
M <- matrix(rnorm(n*n), n)
X <- foreach(i=1:n, .combine='cbind') %:%
foreach(j=1:n, .combine='c') %do% {
sum(M[i,]*M[j,])
}
To avoid repeated computations, you can have j
iterate from i
to n
, and use a .final
function to pad the resulting vectors with NA
:
pad <- function(x) c(rep(NA, n - length(x)), x)
Y <- foreach(i=1:n, .combine='cbind') %:%
foreach(j=i:n, .combine='c', .final=pad) %do% {
sum(M[i,]*M[j,])
}
But these solutions are only of academic interest. For simplicity and speed, I suspect that tcrossprod
is the best solution by far:
Z <- tcrossprod(M)
For a 4000 X 4000 matrix, this executed in under 8 seconds on my Linux machine.
foreach
is not more efficient than for
. Look for the %:%
operator as commented by @BenBarnes to use it. Parallelization might help a bit, but not much.
Try the following instead of an explicit loop:
M <- matrix(1:8,4)
prodsums <- combn(seq_len(nrow(M)), 2, FUN=function(ind) {
res <- sum(M[ind[1],]*M[ind[2],])
names(res) <- paste(ind, collapse="*")
res
}, simplify=F)
unlist(prodsums)
#1*2 1*3 1*4 2*3 2*4 3*4
# 32 38 44 48 56 68
resmat <- matrix(ncol=nrow(M),nrow=nrow(M))
resmat[lower.tri(resmat)] <- unlist(prodsums)
# [,1] [,2] [,3] [,4]
# [1,] NA NA NA NA
# [2,] 32 NA NA NA
# [3,] 38 48 NA NA
# [4,] 44 56 68 NA
resmat[upper.tri(resmat)] <- t(resmat)[upper.tri(resmat)]
diag(resmat) <- rowSums(M^2)
# [,1] [,2] [,3] [,4]
#[1,] 26 32 38 44
#[2,] 32 40 48 56
#[3,] 38 48 58 68
#[4,] 44 56 68 80