How to print results of Python ThreadPoolExecutor.

2019-05-24 23:26发布

问题:

I am running a function for several sets of iterables, returning a list of all results as soon as all processes are finished.

def fct(variable1, variable2):

   # do an operation that does not necessarily take the same amount of
   # time for different input variables and yields result1 and result2

   return result1, result2

variables1 = [1,2,3,4]
variables2 = [7,8,9,0]

with ThreadPoolExecutor(max_workers = 8) as executor:
    future = executor.map(fct,variables1,variables2)
    print '[%s]' % ', '.join(map(str, future))

>>> [ (12,3) , (13,4) , (14,5) , (15,6) ]

How can I print intermediary results e.g. for variable1 = 1, variable2 = 7 as soon as their results are calculated?

回答1:

map already does this, but join needs to consume the entire iterable in order to create the joined string. Changing this to a for loop will let you print it incrementally:

for i in executor.map(fct, v1, v2):
    print(str(i))

Keeping the same output as the join code is a bit more work, but doable regardless:

first = True
print("[ ", end="")
for i in executor.map(fct, v1, v2):
    if first:
        first = False
    else:
        print(" , ", end="")

    print(str(i), end="")
print("]", end="")


回答2:

If you want to consume the results as they finish without preserving the order of the original iterable, you can use executor.submit along with concurrent.futures.as_completed:

from concurrent.futures import ThreadPoolExecutor, as_completed
import time
import random

def fct(variable1, variable2):
   time.sleep(random.randint(1,5))
   return variable1+1, variable2+1

variables1 = [1,2,3,4]
variables2 = [7,8,9,0]

with ThreadPoolExecutor(max_workers = 8) as executor:
    for out in as_completed([executor.submit(fct,*vars) 
                                for vars in zip(variables1, variables2)]):
        print(out.result())

Output (though any order is possible on any given run, due to random.randint):

(4, 10)
(5, 1)
(2, 8)
(3, 9)

as_completed will yield a Future from its input list as soon as that Future is marked as done, regardless of where it actually falls in the input list. This way, if the second item is done after 2 seconds, but the first takes fifteen, you'll see the result of the second items after two seconds, rather than needing to wait fifteen. This may or may not be desirable behavior, depending on your specific use-case.

Edit:

Note that you can still get the output in the original order this way. You just need to save the list you give to as_completed:

with ThreadPoolExecutor(max_workers = 8) as executor:
    jobs = [executor.submit(fct, *vars) 
               for vars in zip(variables1, variables2)]

    for out in as_completed(jobs):
        print(out.result())
    results = [r.result() for r in jobs]
    print(results)

Output:

(5, 1)
(2, 8)
(3, 9)
(4, 10)
[(2, 8), (3, 9), (4, 10), (5, 1)]