Right now I'm working with some huge matrices in R and I need to be able to reassemble them using diagonal bands. For programming reasons (to avoid having to do n*n operations for a matrix of size n (millions of calculations), I wanted to just do 2n calculations (thousands of calculations) and thus chose to do run my function on the diagonal bands of the matrix. Now, I have the results, but need to take these matrix slices and assemble them in a way that allows me to use multiple processors.
Both foreach and mclapply won't let me modify objects outside of loops,so I'm trying to think of a parallel solution. If there was some function to assign an off-diagonal band to a part of a matrix that could be reliably done, I'm all for it.
input:
[1] 0.3503037
[1] 0.2851895 0.2851895
[1] 0.5233396 0.5233396 0.5233396
[1] 0.6250584 0.6250584 0.6250584 0.6250584
[1] 0.4300964 0.4300964 0.4300964 0.4300964 0.4300964
[1] 0.4300964 0.4300964 0.4300964 0.4300964 0.4300964
[1] 0.3949782 0.3949782 0.3949782 0.3949782
[1] 0.7852812 0.7852812 0.7852812
[1] 0.5309648 0.5309648
[1] 0.7718504
desired output (with parallel operations):
[,1] [,2] [,3] [,4] [,5]
[1,] 0.4300964 0.6250584 0.5233396 0.2851895 0.3503037
[2,] 0.3949782 0.4300964 0.6250584 0.5233396 0.2851895
[3,] 0.7852812 0.3949782 0.4300964 0.6250584 0.5233396
[4,] 0.5309648 0.7852812 0.3949782 0.4300964 0.6250584
[5,] 0.7718504 0.5309648 0.7852812 0.3949782 0.4300964
The more I look at this, I need a version of Matrix::bandSparse that is parallelized.
If you want to build a single matrix you are looking for shared memory parallelism. Both
parallel
andforeach
implement distributed memory parallelism. I know of one R package that implements shared memory (Rdsm
), but I have not used it. The more natural way to get shared memory parallelism is by using C++.I have implemented the bands to matrix conversion in R (serial), C++ with Rcpp (serial) and C++ plus OpenMP with Rcpp and RcppParallel (parallel). Note that the input I used was a list of vectors without duplicated diagonal. For the OpenMP solution I converted this to a (ragged)
matrix
since this allows for easy conversion to a thread safeRMatrix
. This conversion is very fast even in R:Benchmarking these solutions on a dual core machine give me:
I am surprised about the speedup from
diags2mtrOmp
.