I have a sum that I'm trying to compute, and I'm having difficulty parallelizing the code. The calculation I'm trying to parallelize is kind of complex (it uses both numpy arrays and scipy sparse matrices). It spits out a numpy array, and I want to sum the output arrays from about 1000 calculations. Ideally, I would keep a running sum over all the iterations. However, I haven't been able to figure out how to do this.
So far, I've tried using joblib's Parallel function and the pool.map function with python's multiprocessing package. For both of these, I use an inner function that returns a numpy array. These functions return a list, which I convert to a numpy array and then sum over.
However, after the joblib Parallel function completes all iterations, the main program never continues running (it looks like the original job is in a suspended state, using 0% CPU). When I use pool.map, I get memory errors after all the iterations are complete.
Is there a way to simply parallelize a running sum of arrays?
Edit: The goal is to do something like the following, except in parallel.
def summers(num_iters):
sumArr = np.zeros((1,512*512)) #initialize sum
for index in range(num_iters):
sumArr = sumArr + computation(index) #computation returns a 1 x 512^2 numpy array
return sumArr
I'm not sure I understand the problem. Are you just trying to partition a list onto a pool of workers, have them keep a running sum of their computations, and sum the result?
(the partition function coming from Python: Slicing a list into n nearly-equal-length partitions )
I figured out how to do parallelize a sum of arrays with multiprocessing, apply_async, and callbacks, so I'm posting this here for other people. I used the example page for Parallel Python for the Sum callback class, although I did not actually use that package for implementation. It gave me the idea of using callbacks, though. Here's the simplified code for what I ended up using, and it does what I wanted it to do.
I was also able to get this working using a parallelized map, which was suggested in another answer. I had tried this earlier, but I wasn't implementing it correctly. Both ways work, and I think this answer explains the issue of which method to use (map or apply.async) pretty well. For the map version, you don't need to define the class Sum and the summers function becomes