Here's my understanding of the Stream framework of Java 8:
- Something creates a source Stream
- The implementation is responsible for providing a BaseStream#parallel() method, which in turn returns a Stream that can run its operations in parallel.
While someone has already found a way to use a custom thread pool with the Stream framework's parallel execution, I cannot for the life of me find any mention in the Java 8 API that the default Java 8 parallel Stream implementations use ForkJoinPool#commonPool(). (This covers Collection#parallelStream(), the methods in the StreamSupport class, and any other possible sources of parallel-enabled streams in the API that I don't know about.)
The only tidbits I could glean from search results were these:
State of the Lambda: Libraries Edition ("Parallelism under the hood")
Vaguely mentions the Stream framework and the Fork/Join machinery.
The Fork/Join machinery is designed to automate this process.
JEP 107: Bulk Data Operations for Collections
Almost directly states that the Collection interface's default method #parallelStream() is implemented using Fork/Join. But still nothing about the common pool.
The parallel implementation builds upon the java.util.concurrency Fork/Join implementation introduced in Java 7.
and hence: Collection#parallelStream().
Class Arrays (Javadoc)
Directly states multiple times that the common pool is used.
The ForkJoin common pool is used to execute any parallel tasks.
So my question is:
Where is it said that the ForkJoinPool#commonPool() is used for parallel operations on streams that are obtained from the Java 8 API?
W.r.t. where is it documented that Java 8 parallel streams use FJ Framework?
Afaik (Java 1.8u5) it is not mentioned in the JavaDoc of parallel streams that a common ForkJoinPool is used.
But it is mentioned in the ForkJoin documentation at the bottom of
http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
W.r.t. replacing the Thread pool
My understanding is that you can use a custom ForkJoinPool instead of the common one (see "Custom thread pool in Java 8 parallel stream"), but not a custom thread pool that differs from the ForkJoin implementation. I have an open question about this: "How to (globally) replace the common thread pool backend of Java parallel streams?"
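To illustrate the custom ForkJoinPool approach mentioned above, here is a minimal sketch. It relies on the commonly reported behaviour that a parallel stream started from inside a ForkJoinPool task runs its work in that pool; as far as I know this is not guaranteed by the Javadoc. The class name CustomPoolSketch and the method sumInPool are made up for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CustomPoolSketch {
    // Names of the threads that processed at least one element.
    static final Set<String> workerNames = ConcurrentHashMap.newKeySet();

    // Sums 1..upTo with a parallel stream started from inside a custom pool.
    // Assumption: the stream's tasks are forked into the submitting pool.
    static long sumInPool(int parallelism, int upTo) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.submit(() ->
                IntStream.rangeClosed(1, upTo)
                         .parallel()
                         .peek(i -> workerNames.add(Thread.currentThread().getName()))
                         .asLongStream()
                         .sum()
            ).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInPool(4, 1000));
        // Inspect which threads did the work; names depend on the JVM and run.
        System.out.println(workerNames);
    }
}
```

Printing the thread names is a quick way to check, on your own JVM, whether the work stayed in the custom pool or leaked into the common one.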
W.r.t. replacing the Streams api
You may check out https://github.com/nurkiewicz/LazySeq, which is a more Scala-like streams implementation: very nice, very interesting.
PS (w.r.t. ForkJoin and Streams)
If you are interested, I would like to note that I stumbled across some issues with the use of the FJ pool, see, e.g.
- Nested Java 8 parallel forEach loop perform poor. Is this behavior expected?
- Using a semaphore inside a nested Java 8 parallel stream action may DEADLOCK. Is this a bug?
For what it's worth, Java 8 in Action has a chapter on Parallel data processing and performance (Chapter 7). It says:
"...the Stream interface gives you the opportunity to execute
operations in parallel on a collection of data without much effort."
"...you’ll see how Java can make this magic happen or, more
practically, how parallel streams work under the hood by employing the
fork/join framework introduced in Java 7."
It also has a small side note in section 7.1:
"Parallel streams internally use the default ForkJoinPool...which by default has as many threads as you have processors, as returned by Runtime.getRuntime().availableProcessors()."
"you can change the size of this pool using the system property java.util.concurrent.ForkJoinPool.common.parallelism, as in the following example:"
System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism","12");
As mentioned in the comments and other answers, this does not mean that parallel streams will always use the fork/join framework.
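To see what the side note describes on your own machine, a small sketch like the following prints the processor count next to the parallelism the common pool reports. The class name CommonPoolParallelism is made up for this example. Two hedges: the property is only read when the common pool is first initialized, so setting it later in a running program has no effect, and as far as I know the default parallelism is usually one less than the processor count, because the calling thread also takes part in the work.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolParallelism {
    // Target parallelism of the common pool as reported by the JDK.
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        // Only effective if nothing has touched the common pool yet;
        // passing -Djava.util.concurrent.ForkJoinPool.common.parallelism=12
        // on the command line is the safer way to set it.
        System.setProperty(
            "java.util.concurrent.ForkJoinPool.common.parallelism", "12");

        System.out.println("processors:  "
            + Runtime.getRuntime().availableProcessors());
        System.out.println("parallelism: " + commonParallelism());
    }
}
```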
You can check the source code of terminal operations on GrepCode. For example, let's take a look at ForEachOp. As you can see, the evaluateParallel method of ForEachOp creates and invokes a ForEachTask object, which is derived from CountedCompleter, which in turn is derived from ForkJoinTask.
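If you do not want to read the JDK source, the same relationship can be observed from the outside. The sketch below (class and method names are made up for this example) checks that CountedCompleter is a ForkJoinTask and records which threads execute a parallel forEach. On a typical JVM I would expect to see the calling thread plus common-pool workers, but the exact names depend on the runtime, so treat the output as an observation rather than a documented contract.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinTask;
import java.util.stream.IntStream;

class ParallelStreamThreads {
    // Runs a parallel forEach and returns the names of the threads used.
    static Set<String> threadsUsed(int n) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    // True if CountedCompleter is a subclass of ForkJoinTask.
    static boolean completerIsForkJoinTask() {
        return ForkJoinTask.class.isAssignableFrom(CountedCompleter.class);
    }

    public static void main(String[] args) {
        System.out.println(completerIsForkJoinTask());
        System.out.println(threadsUsed(100_000));
    }
}
```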