I want to use parallel arrays for a task, and before I start coding, I'd like to know whether this small snippet is thread-safe:
import collection.mutable._
var listBuffer = ListBuffer[String]("one","two","three","four","five","six","seven","eight","nine")
var jSyncList = java.util.Collections.synchronizedList(new java.util.ArrayList[String]())
listBuffer.par.foreach { e =>
println("processed :"+e)
// using sleep here to simulate a random delay
Thread.sleep((scala.math.random * 1000).toLong)
jSyncList.add(e)
}
jSyncList.toArray.foreach(println)
Are there better ways of processing something with parallel collections and accumulating the results elsewhere?
The code you posted is perfectly safe; I'm not sure about the premise though: why do you need to accumulate the results of a parallel collection in a non-parallel one? One of the whole points of the parallel collections is that they look like other collections.
I think that parallel collections will also provide a `seq` method to switch back to sequential ones, so you should probably use that!
The code you've posted is safe: there will be no errors due to inconsistent state of your array list, because access to it is synchronized.
However, parallel collections process items concurrently (at the same time) and out of order. Out of order means that the 54th element may be processed before the 2nd element, so your synchronized array list will contain items in an unpredictable order.
In general it's better to use `map`, `filter` and other functional combinators to transform a collection into another collection. These ensure that ordering guarantees are preserved if the collection has any (as `Seq`s do). For example, an expression such as `ParArray(1, 2, 3, 4) map (_ + 1)` always returns `ParArray(2, 3, 4, 5)`.
However, if you need a specific thread-safe collection type, such as a `ConcurrentSkipListMap` or a synchronized collection, to be passed to some method in some API, modifying it from a parallel `foreach` is safe.
Finally, a note: parallel collections provide parallel bulk operations on data. Mutable parallel collections are not thread-safe in the sense that you cannot safely add elements to them from different threads. Mutating operations like inserting into a map or appending to a buffer still have to be synchronized.
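To make the contrast concrete, here is a rough sketch of an order-preserving `map` next to a side-effecting `foreach`. It assumes Scala 2.12 or earlier, where `.par` is built in; on 2.13+ you would need the separate scala-parallel-collections module and `import scala.collection.parallel.CollectionConverters._`. The names `input`, `mapped` and `collected` are made up for the example.

```scala
// Assumes Scala 2.12 or earlier, where .par is available without extra imports.
import java.util.Collections

val input = Vector(1, 2, 3, 4)

// map runs in parallel, but the resulting parallel sequence keeps
// the elements in input order: 2, 3, 4, 5.
val mapped = input.par.map(_ + 1)

// seq converts back to an ordinary sequential collection.
val backToSequential = mapped.seq

// Adding to a synchronized list from a parallel foreach is safe,
// but the order of the elements in the list is not guaranteed.
val collected = Collections.synchronizedList(new java.util.ArrayList[Int]())
input.par.foreach(e => collected.add(e + 1))
```

After this runs, `backToSequential` always has the same order as `input`, while `collected` holds the same four values in whatever order the threads happened to finish.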
For this pattern to be safe, `f` has to be able to run concurrently in a safe way. I think the same rules that you need for safe multi-threading apply: access to shared state needs to be thread-safe, the order of the `f` calls for different `e` won't be deterministic, and you may run into deadlocks as you start synchronizing your statements in `f`.
Additionally, I'm not clear what guarantees the parallel collections give you about the underlying collection being modified while it is being processed, so a mutable list buffer, which can have elements added or removed, is possibly a poor choice. You never know when the next coder will call something like `foo(listBuffer)` before your `foreach` and pass that reference to another thread, which may mutate the list while it's being processed.
Other than that, I think for any `f` that takes a long time, can be called concurrently, and where `e` can be processed out of order, this is a fine pattern.
Disclaimer: I have not tried parallel collections myself, but I'm looking forward to having SO questions and answers show us what works well.
This code is plain weird -- why add stuff in parallel to something that needs to be synchronized? You'll add contention and gain absolutely nothing in return.
The principle of the thing, accumulating results from parallel processing, is better achieved with operations like `fold`, `reduce` or `aggregate`.
The `synchronizedList` should be safe, though the `println` may give unexpected results: you have no guarantees about the order in which items will be printed, or even that your `println`s won't be interleaved mid-character.
A synchronised list is also unlikely to be the fastest way to do this. A safer solution is to `map` over an immutable collection (`Vector` is probably your best bet here), then print all the lines (in order) afterwards.
You'll also note that code like this has about as much practical usefulness as your example :)
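The two suggestions above, mapping over an immutable `Vector` and printing afterwards, and accumulating with a combinator such as `aggregate`, might look roughly like this. It is a sketch rather than the answerers' original code, assuming Scala 2.12 or earlier for the built-in `.par`; `items`, `process` and `totalLength` are illustrative names.

```scala
// Assumes Scala 2.12 or earlier, where .par is available without extra imports.
val items = Vector("one", "two", "three", "four", "five")

// Hypothetical stand-in for the per-element work.
def process(e: String): String = "processed: " + e

// The work runs in parallel; map keeps the results in input order.
val lines = items.par.map(process).seq

// Printing happens sequentially afterwards, so lines appear in order
// and cannot be interleaved.
lines.foreach(println)

// Accumulating a single value without shared mutable state:
// the first function folds an element into a partial result,
// the second merges partial results from different chunks.
val totalLength = items.par.aggregate(0)((acc, e) => acc + e.length, _ + _)
```

Neither version needs a synchronized collection, so there is no lock for the worker threads to contend on.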