Is there a difference between nested parallelism a

Posted 2019-06-07 05:15

I know that enabling nested parallelism allows a nested omp parallel for loop to be parallelized as well. But I use collapse(2) on my nested for loops (a for inside a for) instead.

Is there a difference? Why or why not? Assume the best case scenario: no dependence between the loop indices and other things equal.

1 answer
SAY GOODBYE
#2 · 2019-06-07 05:53

Yes, there is a huge difference. Use collapse (the clause is spelled collapse, not collapsed). Do not use nested parallelism.

Nested parallelism means that there are independent teams of threads working on the different levels of worksharing. You can run into all sorts of trouble, either by oversubscribing CPU cores with too many threads or by leaving CPU cores idle because some threads are in a team that has no work right now. It is rather hard to get decent performance out of nested parallelism, which is why you usually need to enable it explicitly.
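To make the "independent teams" point concrete, here is a minimal sketch of what nested parallelism looks like in C. The function name fill_nested, the matrix-fill workload, and the choice of 2 threads per level are assumptions made up for illustration; they are not from the question. Each outer thread starts its own inner team, so up to 2 x 2 = 4 threads can be active, and larger thread counts multiply the same way.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: fills a rows x cols row-major matrix so that
   out[i * cols + j] == i + j, using two nested parallel regions. */
void fill_nested(int rows, int cols, double *out)
{
    assert(out != NULL);
#ifdef _OPENMP
    /* Nested regions are typically inactive unless explicitly enabled.
       Note: this changes a runtime setting as a side effect. */
    omp_set_max_active_levels(2);
#endif
    /* Outer team: 2 threads (illustrative value) split the i loop. */
    #pragma omp parallel for num_threads(2)
    for (int i = 0; i < rows; i++) {
        /* Each outer thread spawns its own inner team of 2 threads,
           independent of the inner teams of the other outer threads. */
        #pragma omp parallel for num_threads(2)
        for (int j = 0; j < cols; j++) {
            out[(size_t)i * cols + j] = (double)(i + j);
        }
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the loops simply run serially with the same result.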

Collapsing loops, on the other hand, means that the different loops are joined at the work-sharing level. This allows one team of threads (usually with as many threads as available CPU cores) to work efficiently through all the iterations of the loops.
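A minimal sketch of the collapsed variant, under the question's assumption of no dependence between iterations. The function name fill_collapsed and the matrix-fill workload are made up for illustration. The collapse(2) clause merges the two loops into a single iteration space of rows * cols iterations, which is then divided among one team of threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fills a rows x cols row-major matrix so that
   out[i * cols + j] == i + j, using a single collapsed loop nest. */
void fill_collapsed(int rows, int cols, double *out)
{
    assert(out != NULL);
    /* One team of threads shares all rows * cols iterations.
       The loops must be perfectly nested for collapse to apply. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            out[(size_t)i * cols + j] = (double)(i + j);
        }
    }
}
```

Because the work is split over the combined iteration space, this also helps when the outer loop alone has fewer iterations than there are threads.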
