Multiprocessing has cutoff at 992 integers being j

2019-09-03 19:46发布

问题:

I am following this book http://doughellmann.com/pages/python-standard-library-by-example.html

Along with some online references. I have some algorithm setup for multiprocessing where i have a large array of dictionaries and do some calculation. I use multiprocessing to divide the indexes on which the calculations are done on the dictionary. To make the question more general, I replaced the algorithm with just some array of return values. From finding information online and other SO, I think it has to do with the join method.

The structure is like so,

Generate some fake data, call the manager function for multiprocessing, create a Queue, divide data over the number of index. Loop through the number of processes to use, send each process function the correct index range. Lastly join the processes and print out the results.

What I have figured out, is if the function used by the processes is trying to return a range(0,992), it works quickly, if the range(0,993), it hangs. I tried on two different computers with different specs.

The code is here:

import multiprocessing

def main():
    data = []
    for i in range(0,10):
        data.append(i)

    CalcManager(data,start=0,end=50)

def CalcManager(myData,start,end):
    print 'in calc manager'
    #Multi processing
    #Set the number of processes to use.  
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []
    #Divide up the data for the set number of processes 
    interval = (end-start)/nprocs 
    new_start = start
    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        print 'starting processes'
        new_end = new_start + interval
        #Make sure we dont go past the size of the data 
        if new_end > end:
            new_end = end 
        #Generate a new process and pass it the arguments 
        data = myData[new_start:new_end]
        #Create the processes and pass the data and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(data,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()
        #Increment our next start to the current end 
        new_start = new_end+1
    print 'finished starting'    
    #Joint the process to wait for all data/process to be finished
    for p in procs:
        p.join()

    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print result

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    print 'started process'
    results = range(0,(992))
    result_q.put(results)
    return

if __name__== '__main__':   
    main()

Is there something about these numbers specifically or am I just missing something basic that has nothing to do with these numbers?

From my searches, it seems this is some memory issue with the join method, but the book does not really explain how to solve this using this setup. Is it possible to use this structure (i understand it mostly, so it would be nice if i can continue to use this) and also pass back large results. I know there are other methods to share data between processes, but thats not what I need, just return the values and join them to one array once completed.

回答1:

I can't reproduce this on my machine, but it sounds like items in put into the queue haven't been flushed to the underlying pipe. This will cause a deadlock if you try to terminate the process, according to the docs:

As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe. This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.

If you're in this situation. your p.join() calls will hang forever, because there's still buffered data in the queue. You can avoid it by consuming from the queue before you join the processes:

#Print out the results
for i in range(nprocs):
    result = result_q.get()
    print result

#Joint the process to wait for all data/process to be finished
for p in procs:
    p.join()

This doesn't affect the way the code works, each result_q.get() call will block until the result is placed on the queue, which has the same effect has calling join on all processes prior to calling get. The only difference is you avoid the deadlock.