In Scala, I have a parallel Iterable of items, and I want to iterate over them and aggregate the results in some way, but in order. To simplify my use case: say we start with an Iterable of integers and want to concatenate their string representations in parallel, with the result in order.
Is this possible with either fold or aggregate? It's unclear from the documentation which methods run in parallel but still maintain order.
Yes, order is guaranteed to be preserved for fold/aggregate/reduce operations on parallel collections. This is not very well documented. The trick is that the operation you wish to fold over must be associative (so the work can be arbitrarily split up and recombined), but it need not be commutative (so the operands are never reordered). String concatenation is a perfect example of an associative, non-commutative operation, so the fold can be done in parallel.
val concat = myParallelList.map(_.toString).reduce(_+_)
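To make the associativity argument concrete, here is a minimal sketch that uses only the standard library and imitates what a parallel reduce does: it reduces contiguous chunks independently and then combines the partial results in chunk order. The object name AssociativeConcat and the helper chunkedReduce are hypothetical names for illustration, not part of the parallel collections API.

```scala
object AssociativeConcat {
  // Hypothetical helper: imitates a parallel reduce by reducing each
  // contiguous chunk on its own and then combining the partial results
  // in chunk order. Requires a non-empty input and chunkSize >= 1.
  def chunkedReduce(xs: Seq[String], chunkSize: Int): String =
    xs.grouped(chunkSize)      // contiguous chunks, in original order
      .map(_.reduce(_ + _))    // reduce inside each chunk
      .reduce(_ + _)           // combine the partial results left to right

  def main(args: Array[String]): Unit = {
    val items = (1 to 10).map(_.toString)
    val sequential = items.reduce(_ + _)
    // String concatenation is associative, so every chunk size gives the
    // same string as the plain sequential reduce.
    val allEqual = (1 to 10).forall(n => chunkedReduce(items, n) == sequential)
    println(sequential) // 12345678910
    println(allEqual)   // true
  }
}
```

The chunk boundaries change how the concatenations are grouped, but never the left-to-right order of the operands, which is why a non-commutative operation is still safe here.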
For folds: foldRight and foldLeft cannot be processed in parallel, because each of them fixes a strict element-by-element evaluation order. You'll need to use the new fold method instead.
Like fold, aggregate can do its work in parallel: it “traverses the elements in different partitions sequentially” (Scaladoc), though it looks like you have no direct influence on how the partitions are chosen.
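As a rough illustration of that Scaladoc sentence, the sketch below imitates aggregate with the standard library only: seqop folds the elements inside each partition sequentially, and combop merges the per-partition results in partition order. The names AggregateSketch and aggregateInPartitions are hypothetical, the fixed partitionSize is an assumption (the real implementation picks its own partitions), and the sketch assumes z is a neutral element for combop.

```scala
object AggregateSketch {
  // Hypothetical stand-in for aggregate(z)(seqop, combop) on a parallel
  // collection. Partitions here are fixed-size contiguous chunks, which
  // is an assumption made for illustration only.
  def aggregateInPartitions[A, B](xs: Seq[A], partitionSize: Int)(z: => B)(
      seqop: (B, A) => B,
      combop: (B, B) => B
  ): B =
    xs.grouped(partitionSize)        // contiguous partitions, in order
      .map(_.foldLeft(z)(seqop))     // sequential traversal inside a partition
      .foldLeft(z)(combop)           // merge partition results in order

  def main(args: Array[String]): Unit = {
    val result = aggregateInPartitions(Seq(1, 2, 3, 4, 5), 2)("")(
      (acc, i) => acc + i.toString,  // seqop: append one element
      _ + _                          // combop: join two partial strings
    )
    println(result) // 12345
  }
}
```

Because the empty string is neutral for concatenation and the partitions are merged in order, the result matches a plain sequential foldLeft.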
I think the preservation of "order", in the sense of the comment on Jean-Philippe Pellet's answer, is guaranteed by the way parallel collections are implemented, according to a publication by Odersky (http://infoscience.epfl.ch/record/150220/files/pc.pdf), if and only if the part that splits your collection behaves well with respect to order.
That is, if you have elements a < b < c and a and c end up in the same partition, then b must be in that partition as well.
I don't remember exactly which part is responsible for the splitting, but if you find it, its documentation or source code might give you enough information to answer your question.
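The contiguity condition above can be illustrated without parallel collections at all. In this sketch (SplitOrder and combine are hypothetical names), the same combining step is applied to a contiguous split and to a split that interleaves elements. Only the contiguous one keeps the original order.

```scala
object SplitOrder {
  // Hypothetical helper: concatenates the elements inside each partition,
  // then concatenates the partial results in partition order.
  def combine(partitions: Seq[Seq[Int]]): String =
    partitions.map(_.map(_.toString).mkString).mkString

  def main(args: Array[String]): Unit = {
    val xs = Seq(1, 2, 3, 4, 5, 6)

    // Well-behaved split: each partition is a contiguous run of elements.
    val (front, back) = xs.splitAt(3)
    println(combine(Seq(front, back)))  // 123456

    // Badly behaved split: 1 and 3 share a partition, but 2 does not.
    val (odds, evens) = xs.partition(_ % 2 == 1)
    println(combine(Seq(odds, evens)))  // 135246
  }
}
```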