Java 5 introduced support for asynchronous task execution by a thread pool in the form of the Executor framework, whose heart is the thread pool implemented by java.util.concurrent.ThreadPoolExecutor. Java 7 added an alternative thread pool in the form of java.util.concurrent.ForkJoinPool.
Looking at their respective APIs, ForkJoinPool provides a superset of ThreadPoolExecutor's functionality in standard scenarios (though strictly speaking ThreadPoolExecutor offers more opportunities for tuning than ForkJoinPool). Add to this the observation that fork/join tasks seem to be faster (possibly due to the work-stealing scheduler) and definitely need fewer threads (due to the non-blocking join operation), and one might get the impression that ThreadPoolExecutor has been superseded by ForkJoinPool.
But is this really correct? All the material I have read seems to sum up to a rather vague distinction between the two types of thread pools:
- ForkJoinPool is for many, dependent, task-generated, short, hardly ever blocking (i.e. compute-intensive) tasks
- ThreadPoolExecutor is for few, independent, externally-generated, long, sometimes blocking tasks
Is this distinction correct at all? Can we say anything more specific about this?
ThreadPoolExecutor (TP) and ForkJoinPool (FJ) are targeted at different use cases. The main difference is the number of queues each executor employs, which determines what type of problem suits each one.
The FJ executor has n (the parallelism level) separate concurrent queues (deques), while the TP executor has only one concurrent queue (these queues/deques may be custom implementations that do not follow the JDK Collections API). As a result, in scenarios where a large number of (usually relatively short-running) tasks are generated, the FJ executor will perform better, as the independent queues minimize contended operations and infrequent steals help with load balancing. In TP, because of the single queue, there is contention every time work is dequeued, so the queue acts as a relative bottleneck and limits performance.
In contrast, if there are relatively few long-running tasks, the single queue in TP is no longer a performance bottleneck. However, the n independent queues and relatively frequent work-stealing attempts now become a bottleneck in FJ, as there may be many futile attempts to steal work, which add overhead.
In addition, the work-stealing algorithm in FJ assumes that (older) tasks stolen from the deque will produce enough parallel tasks to reduce the number of steals. E.g. in quicksort or mergesort where older tasks equate to larger arrays, these tasks will generate more tasks and keep the queue non-empty and reduce the number of overall steals. If this is not the case in a given application then the frequent steal attempts again become a bottleneck. This is also noted in the javadoc for ForkJoinPool:
this class provides status check methods (for example getStealCount())
that are intended to aid in developing, tuning, and monitoring
fork/join applications.
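As a rough illustration of the point above, here is a minimal sketch of a divide-and-conquer task whose older (larger) subtasks keep spawning more work, followed by a read of getStealCount(). The names StealCountDemo, SumTask, THRESHOLD and parallelSum are made up for this example, and the threshold value is arbitrary rather than tuned.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class StealCountDemo {

    /** Sums data[from, to) by recursively splitting the range in half. */
    static final class SumTask extends RecursiveTask<Long> {
        // Arbitrary cutoff chosen for illustration; not a tuned value.
        private static final int THRESHOLD = 1_000;
        private final long[] data;
        private final int from;
        private final int to; // exclusive

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                     // queued on this worker's deque; an idle worker may steal it
            long rightSum = right.compute(); // keep working on the other half in this thread
            return left.join() + rightSum;
        }
    }

    static long parallelSum(ForkJoinPool pool, long[] data) {
        return pool.invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        ForkJoinPool pool = new ForkJoinPool();
        try {
            System.out.println("sum    = " + parallelSum(pool, data));
            // The steal count is an estimate and varies from run to run.
            System.out.println("steals = " + pool.getStealCount());
        } finally {
            pool.shutdown();
        }
    }
}
```

A steal count that is very high relative to the number of tasks might hint that tasks are not generating enough local subtasks, which is the situation described above.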
Recommended reading: http://gee.cs.oswego.edu/dl/jsr166/dist/docs/
From the docs for ForkJoinPool:
A ForkJoinPool differs from other kinds of ExecutorService mainly by
virtue of employing work-stealing: all threads in the pool attempt to
find and execute tasks submitted to the pool and/or created by other
active tasks (eventually blocking waiting for work if none exist).
This enables efficient processing when most tasks spawn other subtasks
(as do most ForkJoinTasks), as well as when many small tasks are
submitted to the pool from external clients. Especially when setting
asyncMode to true in constructors, ForkJoinPools may also be
appropriate for use with event-style tasks that are never joined.
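To make the last sentence of that quote concrete, here is a small sketch that builds a pool with asyncMode set to true and submits fire-and-forget tasks that are never joined. AsyncModeDemo and runEvents are hypothetical names, and the "event" is just a counter increment standing in for real handling logic. Note that, per the constructor's javadoc, asyncMode selects FIFO scheduling for tasks forked by worker threads, so its effect would be more visible with handlers that fork follow-up work.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncModeDemo {

    /** Submits eventCount fire-and-forget tasks and returns how many were handled. */
    static int runEvents(int eventCount) throws InterruptedException {
        ForkJoinPool pool = new ForkJoinPool(
                Runtime.getRuntime().availableProcessors(),
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                null,  // no custom UncaughtExceptionHandler
                true); // asyncMode: FIFO scheduling for forked tasks that are never joined

        AtomicInteger handled = new AtomicInteger();
        for (int i = 0; i < eventCount; i++) {
            // Placeholder "event handler"; nobody ever joins on this task.
            pool.execute(() -> {
                handled.incrementAndGet();
            });
        }

        pool.shutdown(); // previously submitted tasks still run
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return handled.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("handled events: " + runEvents(1_000));
    }
}
```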
The fork/join framework is useful for parallel execution, while an executor service allows for concurrent execution, and the two are not the same thing. See this and this.
The fork join framework also allows for work stealing (usage of a Deque).
This article is a good read.
AFAIK, ForkJoinPool works best if you have a large piece of work and you want it broken up automatically. ThreadPoolExecutor is a better choice if you know how you want the work broken up. For this reason I tend to use the latter, because I have determined how I want the work broken up. As such, it's not for everyone.
It's worth noting that when it comes to relatively random pieces of business logic, a ThreadPoolExecutor will do everything you need, so why make it more complicated than it needs to be?
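For comparison, a sketch of the "I know how I want the work broken up" case: the caller picks the number of chunks up front and submits one task per chunk to a fixed-size pool. ManualSplitDemo and sum are names invented for this example. Executors.newFixedThreadPool is used for brevity; in current JDKs it is backed by a ThreadPoolExecutor, though its declared return type is only ExecutorService.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ManualSplitDemo {

    /** Sums data by splitting it into a caller-chosen number of chunks (chunks >= 1). */
    static long sum(long[] data, int chunks) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunkSize = (data.length + chunks - 1) / chunks; // ceiling division
            for (int c = 0; c < chunks; c++) {
                final int from = Math.min(data.length, c * chunkSize);
                final int to = Math.min(data.length, from + chunkSize);
                // One independent, externally generated task per chunk.
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = from; i < to; i++) {
                        s += data[i];
                    }
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that chunk is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long[] data = new long[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println("sum = " + sum(data, 4));
    }
}
```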
Let's compare the differences in constructors:
ThreadPoolExecutor
    ThreadPoolExecutor(int corePoolSize,
                       int maximumPoolSize,
                       long keepAliveTime,
                       TimeUnit unit,
                       BlockingQueue<Runnable> workQueue,
                       ThreadFactory threadFactory,
                       RejectedExecutionHandler handler)
ForkJoinPool
    ForkJoinPool(int parallelism,
                 ForkJoinPool.ForkJoinWorkerThreadFactory factory,
                 Thread.UncaughtExceptionHandler handler,
                 boolean asyncMode)
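To show the two constructors side by side in use, here is a minimal sketch that calls each one. The class name ConstructorDemo, the factory method names, and all the sizes (2 core threads, 4 max threads, a queue of 100, parallelism 4) are arbitrary values picked for illustration only.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ConstructorDemo {

    static ThreadPoolExecutor newThreadPool() {
        return new ThreadPoolExecutor(
                2,                                     // corePoolSize
                4,                                     // maximumPoolSize
                30L, TimeUnit.SECONDS,                 // keepAliveTime for threads above the core size
                new ArrayBlockingQueue<>(100),         // bounded work queue
                Executors.defaultThreadFactory(),      // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // reject by throwing RejectedExecutionException
    }

    static ForkJoinPool newForkJoinPool() {
        return new ForkJoinPool(
                4,                                               // parallelism
                ForkJoinPool.defaultForkJoinWorkerThreadFactory, // factory
                null,                                            // no UncaughtExceptionHandler
                false);                                          // asyncMode off: LIFO for forked tasks
    }

    public static void main(String[] args) {
        ThreadPoolExecutor tp = newThreadPool();
        ForkJoinPool fj = newForkJoinPool();
        System.out.println("TP queue capacity left: " + tp.getQueue().remainingCapacity());
        System.out.println("FJ parallelism: " + fj.getParallelism());
        tp.shutdown();
        fj.shutdown();
    }
}
```

The ThreadPoolExecutor constructor exposes the queue and the rejection handler directly, while this ForkJoinPool constructor has no equivalent parameters.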
The only advantage I have seen in ForkJoinPool: the work-stealing mechanism used by idle threads.
Java 8 introduced one more API in Executors, newWorkStealingPool, to create a work-stealing pool. You don't have to create a RecursiveTask or RecursiveAction, but you can still use ForkJoinPool.
public static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
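A short sketch of that usage, submitting ordinary Callable tasks through the ExecutorService interface without writing any RecursiveTask or RecursiveAction. WorkStealingPoolDemo and sumOfSquares are hypothetical names for this example. In current OpenJDK builds the returned pool is a ForkJoinPool, but the method's declared return type only promises an ExecutorService.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkStealingPoolDemo {

    /** Computes 1^2 + 2^2 + ... + n^2 with one small task per term. */
    static int sumOfSquares(int n) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int value = i;
                tasks.add(() -> value * value); // plain Callable, no fork/join types needed
            }
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum of squares 1..10 = " + sumOfSquares(10));
    }
}
```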
Advantages of ThreadPoolExecutor over ForkJoinPool:
- You can control the task queue size in ThreadPoolExecutor, unlike in ForkJoinPool.
- You can enforce a rejection policy when you run out of capacity, unlike in ForkJoinPool.
I like these two features in ThreadPoolExecutor, which help keep the system in a healthy state.
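Here is a minimal sketch of both features together: a bounded queue plus AbortPolicy, so that excess tasks are rejected instead of piling up. RejectionDemo and countRejections are made-up names, and the sizes (one thread, a queue of two) are deliberately tiny so that rejection is easy to trigger.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RejectionDemo {

    /** Submits taskCount blocking tasks to a 1-thread pool with a queue of 2; returns the rejections. */
    static int countRejections(int taskCount) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1,                                  // exactly one worker thread
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),           // at most two tasks may wait
                new ThreadPoolExecutor.AbortPolicy()); // throw when saturated

        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < taskCount; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // hold the worker so the queue fills up
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // worker busy and queue full
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected: " + countRejections(5));
    }
}
```

With these settings the pool can hold one running task and two queued ones, so any further submission made while the worker is blocked is rejected. Other built-in handlers such as ThreadPoolExecutor.CallerRunsPolicy could be swapped in to apply back-pressure instead of throwing.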
EDIT:
Have a look at this article for use cases of various types of Executor Service thread pools and evaluation of ForkJoin Pool features.