Scala Parallel Collections: How to know and config

2019-07-22 09:41发布

I am using scala parallel collections.

val largeList = list.par.map(x => largeComputation(x)).toList

It is blazing fast, but I have a feeling that I may run into out-of-memory issues if we run too may "largeComputation" in parallel.

Therefore when testing, I would like to know how many threads is the parallel collection using and if-need-be, how can I configure the number of threads for the parallel collections.

1条回答
看我几分像从前
2楼-- · 2019-07-22 10:08

Here is a piece of scaladoc where they explain how to change the task support and wrap inside it the ForkJoinPool. When you instantiate the ForkJoinPool you pass as the parameter desired parallelism level:

Here is a way to change the task support of a parallel collection:

import scala.collection.parallel._
val pc = mutable.ParArray(1, 2, 3)
pc.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(2))

So for your case it will be

val largeList = list.par
largerList.tasksupport = new ForkJoinTaskSupport(
  new scala.concurrent.forkjoin.ForkJoinPool(x)
)
largerList.map(x => largeComputation(x)).toList
查看更多
登录 后发表回答