Python Multiprocessing Script Freezes Seemingly Without Error

Published 2019-02-26 20:37

Question:

I am trying to use the multiprocessing package to call a function (let's call it myfunc) in parallel, specifically using pool.map, i.e. pool.map(myfunc, myarglist). When I simply loop over myarglist without using multiprocessing there are no errors, which should be the case because all operations in myfunc are wrapped in a try block. However, when I call the function using pool.map the script invariably stops running: it stops printing the "myfunc done!" statement within my function and the processes stop using the CPUs, but it never returns resultlist. I am running Python 2.7 from the terminal on Ubuntu 12.04. What could cause this and how should I fix/troubleshoot the problem?

import multiprocessing
from multiprocessing import Pool

cpu_count = multiprocessing.cpu_count()  # already returns an int
pool = Pool(processes=cpu_count)
resultlist = pool.map(myfunc, myarglist)  # hangs here and never returns
pool.close()
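One way to troubleshoot a silent hang like this (a suggestion, not part of the original question) is to replace the blocking pool.map with pool.map_async and put a timeout on the result, so the script raises an exception instead of freezing forever:

import multiprocessing

pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
async_result = pool.map_async(myfunc, myarglist)  # returns immediately
try:
    # get() raises multiprocessing.TimeoutError instead of blocking forever;
    # the 600-second limit is an arbitrary value chosen for illustration.
    resultlist = async_result.get(timeout=600)
except multiprocessing.TimeoutError:
    print 'workers did not finish within the timeout'
pool.close()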

Update: One issue when using multiprocessing can be the size of the objects involved; if you think that may be the problem, see this answer. As that answer notes, "If this [solution] doesn't work, maybe the stuff you're returning from your functions is not pickleable, and therefore unable to make it through the Queues properly." Multiprocessing passes objects between processes by pickling them. It turned out that one or two of my objects contained soup from BeautifulSoup that would not pickle.
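You can run suspect objects through a pickle round trip yourself before handing them to the pool; anything that fails here will also fail to make it through multiprocessing's queues. A minimal sketch of such a check (the is_picklable helper is illustrative, not from the original post):

import cPickle as pickle  # on Python 3: import pickle

def is_picklable(obj):
    # Try the same round trip multiprocessing performs on its queues.
    try:
        pickle.dumps(obj)
        return True
    except Exception as exc:
        print 'Not picklable:', type(obj), '-', exc
        return False

print is_picklable('plain text')       # True
print is_picklable(open('/dev/null'))  # False: file objects cannot be pickled

For BeautifulSoup results, returning a plain string (e.g. str(soup), or soup.get_text() in BeautifulSoup 4) instead of the soup object itself is one way to avoid the problem.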

Answer 1:

Check whether all the worker processes actually start; printing the process name from an initializer will help you debug this. Also add pool.join() at the end of your code.

Here is a sample:

import multiprocessing

def start_process():
    # Runs once in each worker process as it starts up.
    print 'Starting', multiprocessing.current_process().name

if __name__ == '__main__':

    pool_size = 2
    pool = multiprocessing.Pool(processes=pool_size,
                                initializer=start_process,
                                )

    pool_outputs = pool.map(function_name, argument_list)  # your myfunc, myarglist
    pool.close()  # no more tasks
    pool.join()   # wait for the worker processes to exit
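Note that close() must be called before join(); join() then waits for the worker processes to exit. If the script still hangs with this pattern, the initializer's output at least tells you whether the workers ever started.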