I am running a function for several sets of iterables, returning a list of all results as soon as all processes are finished.
def fct(variable1, variable2):
# do an operation that does not necessarily take the same amount of
# time for different input variables and yields result1 and result2
return result1, result2
variables1 = [1,2,3,4]
variables2 = [7,8,9,0]
with ThreadPoolExecutor(max_workers = 8) as executor:
future = executor.map(fct,variables1,variables2)
print '[%s]' % ', '.join(map(str, future))
>>> [ (12,3) , (13,4) , (14,5) , (15,6) ]
How can I print intermediary results e.g. for variable1 = 1, variable2 = 7 as soon as their results are calculated?
map
already does this, but join
needs to consume the entire iterable in order to create the joined string. Changing this to a for
loop will let you print it incrementally:
for i in executor.map(fct, v1, v2):
print(str(i))
Keeping the same output as the join
code is a bit more work, but doable regardless:
first = True
print("[ ", end="")
for i in executor.map(fct, v1, v2):
if first:
first = False
else:
print(" , ", end="")
print(str(i), end="")
print("]", end="")
If you want to consume the results as they finish without preserving the order of the original iterable, you can use executor.submit
along with concurrent.futures.as_completed
:
from concurrent.futures import ThreadPoolExecutor, as_completed
import time
import random
def fct(variable1, variable2):
time.sleep(random.randint(1,5))
return variable1+1, variable2+1
variables1 = [1,2,3,4]
variables2 = [7,8,9,0]
with ThreadPoolExecutor(max_workers = 8) as executor:
for out in as_completed([executor.submit(fct,*vars)
for vars in zip(variables1, variables2)]):
print(out.result())
Output (though any order is possible on any given run, due to random.randint
):
(4, 10)
(5, 1)
(2, 8)
(3, 9)
as_completed
will yield a Future
from its input list as soon as that Future
is marked as done, regardless of where it actually falls in the input list. This way, if the second item is done after 2 seconds, but the first takes fifteen, you'll see the result of the second items after two seconds, rather than needing to wait fifteen. This may or may not be desirable behavior, depending on your specific use-case.
Edit:
Note that you can still get the output in the original order this way. You just need to save the list you give to as_completed
:
with ThreadPoolExecutor(max_workers = 8) as executor:
jobs = [executor.submit(fct, *vars)
for vars in zip(variables1, variables2)]
for out in as_completed(jobs):
print(out.result())
results = [r.result() for r in jobs]
print(results)
Output:
(5, 1)
(2, 8)
(3, 9)
(4, 10)
[(2, 8), (3, 9), (4, 10), (5, 1)]