I'm trying to implement a function that uses Python multiprocessing to speed up a calculation. I'm trying to create a pairwise distance matrix, but the implementation with for loops takes more than 8 hours.
This code seems to run faster, but when I print the matrix it is full of zeros. When I print the rows inside the function, it seems to work. I think it is a scope problem, but I cannot understand how to deal with it.
import multiprocessing
import time
import numpy as np

def MultiProcessedFunc(i,x):
    # Fill the upper-triangular part of row i of the global matrix M
    for j in range(i,len(x)):
        time.sleep(0.08)
        M[i,j] = (x[i]+x[j])/2
    print(M[i,:]) # Check if the operation works
    print('')

processes = []
v = [x+1 for x in range(8000)]
M = np.zeros((len(v),len(v)))
start = time.time()  # start the timer used in the final report
for i in range(len(v)):
    p = multiprocessing.Process(target = MultiProcessedFunc, args =(i,v))
    processes.append(p)
    p.start()
for process in processes:
    process.join()
end = time.time()
print('Multiprocessing: {}'.format(end-start))
print(M)
Unfortunately, your code won't work as written. Multiprocessing spawns separate processes, which means their memory spaces are separate. Changes made by one subprocess are not reflected in the other subprocesses or in the parent process.
Strictly speaking this is not a scoping issue. Scope is something defined inside a single interpreter process.
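To see the separation of memory in isolation, here is a minimal sketch. The helper name write_ones and the 3-element array are made up for illustration; they are not from your code.

```python
import multiprocessing

import numpy as np

M = np.zeros(3)  # module-level array, playing the role of M in the question


def write_ones():
    """Hypothetical helper: fill M in place and return its sum.

    When run in a child process, this only touches the child's copy of M.
    """
    M[:] = 1.0
    return M.sum()


if __name__ == "__main__":
    child = multiprocessing.Process(target=write_ones)
    child.start()
    child.join()
    print(M)  # prints [0. 0. 0.]: the parent's array was never modified
```

The child does fill its own M with ones, which is why printing inside the function looks correct, but that copy disappears when the child exits.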
The multiprocessing module does provide means of sharing memory between processes, but this comes at a cost (shared memory is much slower because of locking and similar overhead).
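As a rough sketch of the shared-memory route, one option is multiprocessing.Array wrapped in a numpy view. The function name fill_row and the small 4-element input are assumptions for illustration, and no lock is taken here only because each process writes to its own row.

```python
import multiprocessing

import numpy as np


def fill_row(shared, n, i, values):
    """Hypothetical worker: write row i of the n x n matrix held in `shared`."""
    # View the shared buffer as an n x n float64 array (no copy is made).
    matrix = np.frombuffer(shared.get_obj()).reshape(n, n)
    for j in range(i, n):
        matrix[i, j] = (values[i] + values[j]) / 2


if __name__ == "__main__":
    values = [x + 1 for x in range(4)]  # small input, for illustration only
    n = len(values)
    # "d" means C double; the array starts zeroed and carries its own lock.
    shared = multiprocessing.Array("d", n * n)
    workers = [
        multiprocessing.Process(target=fill_row, args=(shared, n, i, values))
        for i in range(n)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # The parent sees the children's writes because the buffer is shared.
    print(np.frombuffer(shared.get_obj()).reshape(n, n))
```

If several processes could write to the same cell, you would need to hold shared.get_lock() around the write, and that is where the slowdown mentioned above comes from.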
Now, numpy has a nice feature: it releases the GIL during many of its computations. This means that using multithreading instead of multiprocessing should give you some benefit with few other changes to your code: simply replace import multiprocessing with import threading, and multiprocessing.Process with threading.Thread. The code should then produce the correct result, because threads share the memory of the process that started them. On my machine, with the print statements and the sleep call removed, it runs in under 8 seconds:
Multiprocessing: 7.48570203781
[[1.000e+00 1.000e+00 2.000e+00 ... 3.999e+03 4.000e+03 4.000e+03]
[0.000e+00 2.000e+00 2.000e+00 ... 4.000e+03 4.000e+03 4.001e+03]
[0.000e+00 0.000e+00 3.000e+00 ... 4.000e+03 4.001e+03 4.001e+03]
...
[0.000e+00 0.000e+00 0.000e+00 ... 7.998e+03 7.998e+03 7.999e+03]
[0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 7.999e+03 7.999e+03]
[0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 8.000e+03]]
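For reference, the threaded variant described above might look roughly like this. The names fill_row and build_matrix are mine, and I pass the matrix in as an argument instead of using a global; otherwise it is the same loop as in your code, shown on a small input.

```python
import threading

import numpy as np


def fill_row(matrix, i, values):
    """Hypothetical worker: fill the upper-triangular part of row i in place."""
    for j in range(i, len(values)):
        matrix[i, j] = (values[i] + values[j]) / 2


def build_matrix(values):
    """Start one thread per row, as in the question, and wait for all of them."""
    matrix = np.zeros((len(values), len(values)))
    threads = [
        threading.Thread(target=fill_row, args=(matrix, i, values))
        for i in range(len(values))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return matrix  # every thread wrote into this same array


if __name__ == "__main__":
    print(build_matrix([x + 1 for x in range(4)]))
```

Each thread writes to a different row, so no locking is needed for this particular computation.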
An alternative is to have your subprocesses return the result and then combine the results in your main process.
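A minimal sketch of that alternative, assuming a multiprocessing.Pool with a handful of workers rather than one process per row. The names compute_row and build_matrix and the default of 4 workers are illustrative choices, not something from your code.

```python
import multiprocessing

import numpy as np


def compute_row(task):
    """Hypothetical worker: build row i locally and return it with its index."""
    i, values = task
    row = np.zeros(len(values))
    for j in range(i, len(values)):
        row[j] = (values[i] + values[j]) / 2
    return i, row


def build_matrix(values, workers=4):
    """Distribute the rows over a pool and assemble the matrix in the parent."""
    matrix = np.zeros((len(values), len(values)))
    tasks = [(i, values) for i in range(len(values))]
    with multiprocessing.Pool(processes=workers) as pool:
        # Results can arrive in any order; the returned index says where each goes.
        for i, row in pool.imap_unordered(compute_row, tasks):
            matrix[i] = row
    return matrix


if __name__ == "__main__":
    print(build_matrix([x + 1 for x in range(4)]))
```

Only the parent ever writes to the matrix, so nothing has to be shared. The price is that the input list and every result row are pickled and sent between processes.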