How to make a computation loop easily splittable and resumable

Posted 2019-06-06 14:02

Question:

I want to find optimal parameters i, j, k in 0..99 for a given computational problem, and I need to run:

for i in range(100):
    for j in range(100):
        for k in range(100):
            dothejob(i, j, k)    # 1 second per computation

This takes a total of 10^6 seconds, i.e. about 11.6 days.
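The estimate can be checked with a few lines of arithmetic. This is only a sketch: the names are illustrative, and `days_with_workers` assumes ideal scaling, which a real 4-core machine will not quite reach.

```python
SECONDS_PER_JOB = 1      # from the question: about 1 second per dothejob call
N = 100                  # i, j and k each run over range(100)

total_seconds = N ** 3 * SECONDS_PER_JOB
total_days = total_seconds / 86_400          # 86,400 seconds in a day

def days_with_workers(workers):
    """Ideal wall-clock days, assuming the work splits perfectly across workers."""
    return total_days / workers
```

Under these assumptions the serial run is about 11.57 days, and a perfect 4-way split would bring it down to a little under 3 days.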

I started doing it by splitting the work among 4 processes (to use the full computing power of my 4-core CPU):

for i in range(100):
    if i % 4 != 0:      # replace 0 with 1, 2, or 3 for the parallel scripts #2, #3, #4
        continue
    for j in range(100):
        for k in range(100):
            dothejob(i, j, k)

        with open('done.log', 'a+') as f:    # log what has been done
            f.write("%i %i\n" % (i, j))

But I have problems with this approach:

  1. I have to run python script.py, then edit script.py to change line 2 to if i % 4 != 1 and run it again, and then repeat the same edit-and-run cycle for if i % 4 != 2 and if i % 4 != 3.

  2. Suppose the loop is interrupted (a reboot, a crash, or anything else). At least done.log records every (i, j) already done, so there is no need to start from 0 again, but there is no easy way to resume the work. (I could open done.log, parse it, and skip the (i, j) already done when restarting the loops. I started doing this, but it felt like reinventing, in a dirty way, something that already exists.)
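For reference, the manual approach described in the two points above could look roughly like the sketch below. The helper names `load_done`, `pending_pairs` and `main` are hypothetical, `dothejob` is the function from the question and is assumed to be defined elsewhere, and all workers are assumed to append to the same done.log as in the original script.

```python
import os

def load_done(path="done.log"):
    """Return the set of (i, j) pairs already recorded in the log."""
    done = set()
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                parts = line.split()
                # Skip a last line that was cut off before its newline was written
                if line.endswith("\n") and len(parts) == 2:
                    done.add((int(parts[0]), int(parts[1])))
    return done

def pending_pairs(worker, n_workers=4, n=100, done=frozenset()):
    """Yield the (i, j) pairs owned by this worker that are not logged yet."""
    for i in range(worker, n, n_workers):    # same split as i % n_workers == worker
        for j in range(n):
            if (i, j) not in done:
                yield i, j

def main(worker, n_workers=4, n=100, log_path="done.log"):
    for i, j in pending_pairs(worker, n_workers, n, load_done(log_path)):
        for k in range(n):
            dothejob(i, j, k)                # the question's function, defined elsewhere
        with open(log_path, "a") as f:       # log only after the whole (i, j) row is done
            f.write("%i %i\n" % (i, j))
```

With `main(int(sys.argv[1]))` placed under an `if __name__ == "__main__":` guard (and `import sys` added), the four workers could be started as python script.py 0 through python script.py 3 without editing the file, and restarting a worker would skip whatever is already in done.log.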

I'm looking for a better solution for this (but, for example, map/reduce might be overkill for this little task, and not easy to use in a few lines of Python).

Question: How to make a computation for i in range(100): for j in range(100): for k in range(100): dothejob(i, j, k) easily splittable among multiple processes and easily resumable (e.g. after reboot) in Python?

Answer 1:

Just map the product over a pool of processes, for example:

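import itertools as it
from multiprocessing import Pool

def jobWrapper(args):  # we need this to unpack the (i, j, k) tuple
    return dothejob(*args)

if __name__ == "__main__":  # required where worker processes are spawned (e.g. Windows)
    the_args = it.product(range(100), range(100), range(100))
    # Create the pool only after jobWrapper is defined, so the workers can find it
    with Pool(4) as pool:
        res = pool.map(jobWrapper, the_args)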

If you want to resume it, knowing the last (i, j, k) from the log, just skip everything already computed in the_args:

the_args = it.product(range(100), range(100), range(100))
# skip the previously computed tuples, up to and including (i, j, k)
while True:
    if next(the_args) == (i, j, k):
        break
...

Here (i, j, k) is the tuple of the last computed values.
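Skipping up to the last logged tuple assumes the tasks finish in iteration order, which is not guaranteed once several workers run in parallel. A variant that filters against the full set of logged tuples is sketched below. The names `load_done_triples`, `pending_args`, `job_wrapper`, `run` and the log file `done_ijk.log` are hypothetical, and `dothejob` here is only a placeholder for the question's function.

```python
import itertools
import os
from multiprocessing import Pool

def load_done_triples(path="done_ijk.log"):
    """Return the set of (i, j, k) triples already recorded in the log."""
    done = set()
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                parts = line.split()
                # Skip a last line that was cut off before its newline was written
                if line.endswith("\n") and len(parts) == 3:
                    done.add(tuple(int(p) for p in parts))
    return done

def pending_args(done, n=100):
    """Lazily yield every (i, j, k) that is not in the done set."""
    return (t for t in itertools.product(range(n), repeat=3) if t not in done)

def dothejob(i, j, k):
    """Placeholder for the question's function (assumption)."""
    return i + j + k

def job_wrapper(args):
    """Run one task and return its arguments so the parent can log them."""
    dothejob(*args)
    return args

def run(log_path="done_ijk.log", processes=4, n=100):
    done = load_done_triples(log_path)
    with Pool(processes) as pool, open(log_path, "a") as log:
        # imap_unordered yields each result as soon as a worker finishes it
        for i, j, k in pool.imap_unordered(job_wrapper, pending_args(done, n)):
            log.write("%i %i %i\n" % (i, j, k))
            log.flush()
```

Calling `run()` under an `if __name__ == "__main__":` guard starts the computation. Because only the parent process writes to the log, and only after a task has returned, rerunning the script after an interruption should redo at most the tasks that were in flight when it stopped.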