Let's run the following line of code several times:
Set(1,2,3,4,5,6,7).par.fold(0)(_ - _)
The results are quite interesting:
scala> Set(1,2,3,4,5,6,7).par.fold(0)(_ - _)
res10: Int = 8
scala> Set(1,2,3,4,5,6,7).par.fold(0)(_ - _)
res11: Int = 20
However clearly it should be like in its sequential version:
scala> Set(1,2,3,4,5,6,7).fold(0)(_ - _)
res15: Int = -28
I understand that operation -
is non-associative on integers and that's the reason behind such behavior, but my question is quite simple: doesn't it mean that fold
should not be parallelized in .par
implementation of collections?
When you look at the standard library documentation, you see that
fold
is undeterministic here:As an alternative, there's
foldLeft
:As
Set
is not an ordered collection, there's no canonical order in which the elements could be folded, so the standard library allows itself to be undeterministic even forfoldLeft
. If you would use an ordered sequence here,foldLeft
would be deterministic in that case.The scaladoc does say:
So, as you stated, a binary operation applied in
ParSet#fold
that is not associative is not guaranteed to produce a deterministic result. The above text is warning is all you get.Does that mean
ParSet#fold
(and cousins) should not be parallelized? Not exactly. If your binary operation is commutative and you don't care about non-determinism of side-effects (not that afold
should have any), then there isn't a problem. However, you are hit with the caveat of needing to tread carefully around parallel collections.Whether or not it is correct is more of a matter of opinion. One could argue that if a method can result in accidental non-determinism, that it should not exist in a language or library. But the alternative is to clip out functionality so that
ParSet
is missing functionality that is present in most of the other collection implementations. You could use the same line of thinking to also suggest the removal ofStream#foreach
to prevent people from accidentally triggering infinite loops on infinite streams, but should you?