I'm hoping to vectorize the following loop:

for (i in 1:n) {
    for (j in 1:m) {
        temp_mat[i,j]=min(temp_mat[i,j],1);
    }
}

I thought I could do temp_mat=min(temp_mat,1), but this is not giving me the desired result. Is there a way to vectorize this loop to make it much faster?

标签： r matrix vectorization

2条回答

Luminary・发光体

2楼-- · 2019-02-26 07:32

The question is slightly confusing because min() is vectorized. In order to obtain the desired result in this specific case, it is however not necessary to use this function. Logical subsetting provides a probably more efficient (and certainly more compact) alternative.

If I understood your desired output correctly, a modification of the matrix as performed by the nested loops in your code can be achieved with a single command:

temp_mat[temp_mat > 1] <- 1

Hope this helps.

set.seed(123)
temp_mat <- matrix(2*runif(50),10)
#> temp_mat
#           [,1]       [,2]      [,3]       [,4]      [,5]
# [1,] 0.5751550 1.91366669 1.7790786 1.92604847 0.2856000
# [2,] 1.5766103 0.90666831 1.3856068 1.80459809 0.8290927
# [3,] 0.8179538 1.35514127 1.2810136 1.38141056 0.8274487
# [4,] 1.7660348 1.14526680 1.9885396 1.59093484 0.7376909
# [5,] 1.8809346 0.20584937 1.3114116 0.04922737 0.3048895
# [6,] 0.0911130 1.79964994 1.4170609 0.95559194 0.2776121
# [7,] 1.0562110 0.49217547 1.0881320 1.51691908 0.4660682
# [8,] 1.7848381 0.08411907 1.1882840 0.43281587 0.9319249
# [9,] 1.1028700 0.65584144 0.5783195 0.63636202 0.5319453
#[10,] 0.9132295 1.90900730 0.2942273 0.46325157 1.7156554
temp_mat[temp_mat > 1] <- 1
#> temp_mat
#           [,1]       [,2]      [,3]       [,4]      [,5]
# [1,] 0.5751550 1.00000000 1.0000000 1.00000000 0.2856000
# [2,] 1.0000000 0.90666831 1.0000000 1.00000000 0.8290927
# [3,] 0.8179538 1.00000000 1.0000000 1.00000000 0.8274487
# [4,] 1.0000000 1.00000000 1.0000000 1.00000000 0.7376909
# [5,] 1.0000000 0.20584937 1.0000000 0.04922737 0.3048895
# [6,] 0.0911130 1.00000000 1.0000000 0.95559194 0.2776121
# [7,] 1.0000000 0.49217547 1.0000000 1.00000000 0.4660682
# [8,] 1.0000000 0.08411907 1.0000000 0.43281587 0.9319249
# [9,] 1.0000000 0.65584144 0.5783195 0.63636202 0.5319453
#[10,] 0.9132295 1.00000000 0.2942273 0.46325157 1.0000000

0人赞添加讨论(0) 举报

forever°为你锁心

3楼-- · 2019-02-26 07:46

Just use temp_mat <- pmin(temp_mat, 1). See ?pmin for more use of parallel minima.

Example:

set.seed(0); A <- matrix(sample(1:3, 25, replace = T), 5)
#> A
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    3    1    1    3    3
#[2,]    1    3    1    2    3
#[3,]    2    3    1    3    1
#[4,]    2    2    3    3    2
#[5,]    3    2    2    2    1
B <- pmin(A, 2)
#> B
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    2    1    1    2    2
#[2,]    1    2    1    2    2
#[3,]    2    2    1    2    1
#[4,]    2    2    2    2    2
#[5,]    2    2    2    2    1

update

Since you have background in computational science, I would like to provide more information.

pmin is fast, but is far from high performance. Its prefix "parallel" only suggests element-wise. The meaning of "vectorization" in R is not the same as "SIMD vectorization" in HPC. R is an interpreted language, so "vectorization" in R means opting for C level loop rather than R level loop. Therefore, pmin is just coded with a trivial C loop.

Real high performance computing should benefit from SIMD vectorization. I believe you know SSE/AVX intrinsics. So if you write a simple C code, using _mm_min_pd from SSE2, you will get ~2 times speedup from pmin; if you see _mm256_min_pd from AVX, you will get ~4 times speedup from pmin.

Unfortunately, R itself can not do any SIMD. I have an answer to a post at Does R leverage SIMD when doing vectorized calculations? regarding this issue. For your question, even if you link your R to a HPC BLAS, pmin will not benefit from SIMD, simply because pmin does not involve any BLAS operations. So a better bet is to write compiled code yourself.

0人赞添加讨论(0) 举报

Vectorize min() for matrix

update

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间