Parallelize a for loop in Python

Posted 2019-08-01 10:09

Question:

I have a genetic algorithm that I would like to speed up. I think the easiest way to do this is with Python's multiprocessing module. After running cProfile on my GA, I found that most of the computation time is spent in the evaluation function.

def evaluation():
    scores = []
    for chromosome in population:
        scores.append(costly_function(chromosome))
    return scores

How would I go about parallelizing this method? It is important that the scores are appended in the same order as they would be if the program ran sequentially.

I'm using Python 2.7.

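Answer 1:

Use a Pool. I show both imap and map because some results on Google suggest that map may not preserve ordering, though I have yet to see proof of that. In practice both return results in the order of the input; only imap_unordered gives no ordering guarantee: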

from multiprocessing import Pool

def evaluation(population):
    return list(Pool(processes=nprocs).imap(costly_function, population))

or (what I use):

    return Pool(processes=nprocs).map(costly_function, population)

Set nprocs to the number of worker processes you want.
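For reference, here is a minimal self-contained sketch of the map variant that you can run to check the ordering yourself. The body of costly_function, the default nprocs=4, and the toy population are made up for illustration and are not from the question. The sleep is there only so that workers tend to finish out of input order. The pool is closed and joined explicitly because Pool cannot be used in a with statement on Python 2.7.

```python
import time
from multiprocessing import Pool


def costly_function(chromosome):
    # Hypothetical stand-in for the real fitness function: earlier items
    # sleep longer, so workers tend to finish out of input order.
    time.sleep(0.01 * max(0, 8 - chromosome))
    return chromosome * chromosome


def evaluation(population, nprocs=4):
    # nprocs=4 is an arbitrary default chosen for this sketch.
    pool = Pool(processes=nprocs)
    try:
        # map blocks until every result is ready and returns the results
        # in the same order as the items in population.
        return pool.map(costly_function, population)
    finally:
        # Shut the workers down instead of leaving the pool to be
        # garbage collected.
        pool.close()
        pool.join()


if __name__ == "__main__":
    population = list(range(8))
    scores = evaluation(population)
    # Compare against the plain sequential loop from the question.
    print(scores == [costly_function(c) for c in population])
```

The function passed to the pool has to be defined at module level so it can be pickled, and the `if __name__ == "__main__":` guard keeps the script from starting new pools when a worker process imports it (this matters on Windows in particular).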

From: https://docs.python.org/dev/library/multiprocessing.html#multiprocessing.pool.Pool