I have a rather massive number of threads being created inside a clojure program:
(import '(java.util.concurrent Executors))
(def *pool*
(Executors/newCachedThreadPool))
(defn do-something []
; work
Thread/sleep 200
; repeat)
(dotimes [i 10000]
(.submit *pool* do-something))
It's been a while between JVMs for me and I am basically wondering here if there is any argument against using sleep or yield inside the function that is being executed by the Executor? If I understand correctly, in this case, every one of my workers has it's own thread and therefore there should be no side effects.
If the Executor is using a FixedThreadPool:
(Executors/newFixedThreadPool 1000)
Things become more complicated because threads will not be returned to the pool until their work is complete, meaning the other queued workers will take longer to complete if the threads are sleeping.
Is my understanding of threading in this instance correct?
(Note: I suspect my design is actually wrong, but just want to make sure I am on the right page)
An executor is conceptually a task queue + a worker pool. Your explanation of what will happen here is basically correct. When you submit a task to the executor, the work is enqueued until a thread can execute the task. When it is executing the task, that task owns the thread and sleeping will block other tasks from being executed on that worker thread.
Depending on what you're doing that may be ok (although it is unusual and probably bad form to sleep inside a task). It's more common to block a thread as a side effect of waiting on IO (blocked on a socket or db call for example).
Generally if you are doing periodic work, it is better to handle that outside the pool and fire tasks when they should be executed, or better yet, use a ScheduledExecutorService instead from Executors/newScheduledThreadPool.
The other main mechanism in Java for performing time-based tasks is java.util.Timer, which is a bit easier to use but not as robust as the ScheduledExecutorService.
Another alternative from Clojure is to explicitly put the worker into a background thread managed by Clojure instead of by you:
An alternative to sleeping your threads is to have each worker have a "sleepUntil" long value. When your executor calls a worker, if it is sleeping it returns immediately. Otherwise, it does its work, then returns. This can help keep your thread count down, because a FixedThreadPoolExecutor will be able to handle many more workers than it has threads, if most of them are flagged as sleeping and return quickly.