I've been looking at the new Scala 2.9 parallel collections and am hoping to abandon a whole lot of my crufty amateur versions of similar things. In particular, I'd like to replace the fork join pool which underlies the default implementation with something of my own (for example, something that distributes evaluation of tasks across a network, via actors). My understanding is that this is simply a matter of applying Scala's paradigm of "stackable modifications", but the collections library is intimidating enough that I'm not exactly sure which bits need modifying!
Some concrete questions:
- Is it correct that the standard parallel implementations interact with the fork join pool solely through the code in `ForkJoinTasks`?
- I see that there's an alternative trait, `FutureThreadPoolTasks`. How would I build a collection which uses this trait instead of `ForkJoinTasks`?
- Can I just write yet another alternative (and perhaps a corresponding boilerplate class that mixes in `AdaptiveWorkStealingTasks`) and somehow instantiate collection instances that use this new trait?
(For reference, all of the traits mentioned above are defined in Tasks.scala.)
Code examples are especially welcome!
Just to provide some more information on how things fit together (which I suspect you already know): the fork-join pool is "plugged in" via the `parallel` package object's `tasksupport` value, which implements the `scala.collection.parallel.TaskSupport` trait.
This, in turn, inherits from `Tasks` (which you mention) and defines such operations as:
However, it's not immediately obvious to me how you can override the behaviour which is explicitly imported by the collections themselves by supplying your own `TaskSupport` implementation. For example, in `ParSeqLike`, line 47:
In fact, I would go so far as to say it looks like the parallelism is definitively not overridable (unless I am very much mistaken, though I often am).
Here is a document describing how to switch `TaskSupport` objects in Scala 2.10.
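For a rough idea of what that looks like in practice, here is a minimal sketch. It assumes Scala 2.10 through 2.12, where parallel collections ship in the standard library and each parallel collection has a settable `tasksupport` member. The object name `SwitchTaskSupport`, the helper `doubledSumOn` and the parallelism level of 2 are made up for illustration.

```scala
import scala.collection.parallel.ForkJoinTaskSupport
import scala.concurrent.forkjoin.ForkJoinPool

object SwitchTaskSupport {
  // Hypothetical helper: doubles 1..1000 in parallel on a dedicated pool and sums the result.
  def doubledSumOn(parallelism: Int): Int = {
    // A parallel collection starts out with the default task support.
    val numbers = Vector.range(1, 1001).par

    // Give this one collection its own fork-join pool.
    val pool = new ForkJoinPool(parallelism)
    numbers.tasksupport = new ForkJoinTaskSupport(pool)

    try {
      // The map on `numbers` is scheduled through the task support assigned above.
      numbers.map(_ * 2).sum
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit =
    println(doubledSumOn(2))
}
```

The same assignment should accept any other `TaskSupport` implementation, so a custom scheduler (for example one that hands tasks to actors) would be plugged in at that point rather than by mixing a different trait into the collection classes.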