I am using the ThreadPoolExecutor class from the concurrent.futures package:
def some_func(arg):
    # does some heavy lifting
    # outputs some results
    ...

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=1) as executor:
    for arg in range(10000000):
        future = executor.submit(some_func, arg)
But I need to limit the queue size somehow, as I don't want millions of futures to be created at once. Is there a simple way to do it, or should I stick to queue.Queue and the threading package to accomplish this?
You should use a semaphore, as shown here: https://www.bettercodebytes.com/theadpoolexecutor-with-a-bounded-queue-in-python/
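A minimal sketch of the semaphore approach, not the linked post's exact code: the class name, the `bound` parameter, and the method names here are my own. The idea is to block `submit()` once a fixed number of tasks are queued or running, and to free a slot from a done-callback when each task finishes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedExecutor:
    """Wraps a ThreadPoolExecutor; a semaphore bounds how many
    tasks may be queued or running at any one time."""

    def __init__(self, max_workers, bound):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._semaphore = threading.BoundedSemaphore(bound)

    def submit(self, fn, *args, **kwargs):
        self._semaphore.acquire()  # blocks while `bound` tasks are pending
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._semaphore.release()
            raise
        # release the slot as soon as the task completes
        future.add_done_callback(lambda _: self._semaphore.release())
        return future

    def shutdown(self, wait=True):
        self._executor.shutdown(wait=wait)
```

With this wrapper, the loop from the question never holds more than `bound` pending futures at once; the producer thread simply blocks inside `submit()` until a slot frees up.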
There is something wrong with andres.riancho's answer: if we set a max_size on the queue, then when we shut down the pool, self._work_queue.put(None) can block because it is also limited by max_size, so the pool will never exit.
I've been doing this by chunking the range. Here's a working example.
The downside to this approach is the chunks are synchronous. All threads in a chunk must complete before the next chunk is added to the pool.
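A hypothetical sketch of that chunked approach (the function and names are mine, not the answerer's code): only one chunk's worth of work is handed to the pool at a time, and `executor.map()` is consumed fully before the next chunk starts, which is exactly the synchronous downside described above.

```python
from concurrent.futures import ThreadPoolExecutor


def some_func(arg):
    return arg * 2  # placeholder for the real heavy lifting


def process_in_chunks(total, chunk_size=1000, max_workers=4):
    """Submit the range in chunks so at most `chunk_size`
    tasks are pending at any time."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        for start in range(0, total, chunk_size):
            chunk = range(start, min(start + chunk_size, total))
            # map() is drained before the next chunk is submitted,
            # so the chunks run synchronously with respect to each other
            results.extend(executor.map(some_func, chunk))
    return results
```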
Python's ThreadPoolExecutor doesn't have the feature you're looking for, but the provided class can easily be sub-classed as follows to provide it:
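One possible sketch of such a subclass (my own illustration, not necessarily the answerer's code): it swaps the executor's internal work queue for a bounded queue.Queue, so submit() blocks once maxsize items are pending. Note that this relies on the private _work_queue attribute of CPython's implementation, and a comment above points out that a bounded queue can also make shutdown's internal put(None) block when the queue is full.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class BoundedThreadPoolExecutor(ThreadPoolExecutor):
    """ThreadPoolExecutor whose work queue is bounded, so submit()
    blocks once `maxsize` tasks are waiting.

    Caveat: this replaces the private _work_queue attribute of
    CPython's implementation, which is not a guaranteed API.
    """

    def __init__(self, maxsize, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # queue.Queue.put() blocks when the queue holds `maxsize` items
        self._work_queue = queue.Queue(maxsize=maxsize)
```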