Python multiprocessing global numpy arrays

Published 2019-08-20 09:11

I have the following script:

import multiprocessing
import numpy as np

max_number = 100000
minimums = np.full((max_number), np.inf, dtype=np.float32)
data = np.zeros((max_number, 128, 128, 128), dtype=np.uint8)

def worker(start, end):

    for in_idx in range(start, end):
        value = data[in_idx] # compute something using this array
        minimums[in_idx] = value

def main():

    jobs = []
    num_jobs = 5
    for i in range(num_jobs):
        start = int(i * (1000 / num_jobs))
        end = int(start + (1000 / num_jobs))

        p = multiprocessing.Process(name='worker_' + str(i), target=worker, args=(start, end))
        jobs.append(p)
        p.start()

    for proc in jobs:
        proc.join()
    print(jobs)

if __name__ == '__main__':
    main()

How can I make these NumPy arrays global so that every worker can access them? Each worker operates on a different slice of the array.
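A minimal sketch of one common approach (sizes and the computation are placeholders, not the asker's real workload): allocate the result buffer with `multiprocessing.Array` so all processes write into the same memory, and re-wrap it as a NumPy view inside each worker.

```python
import multiprocessing as mp
import numpy as np

def worker(shared, start, end):
    # Re-wrap the shared buffer as a NumPy view inside this process;
    # writes land in the same memory the parent process sees.
    mins = np.frombuffer(shared.get_obj(), dtype=np.float32)
    for i in range(start, end):
        mins[i] = float(i)  # placeholder for the real computation

def main():
    n = 8
    shared = mp.Array('f', n)  # ctypes 'f' matches np.float32
    num_jobs = 2
    jobs = []
    for i in range(num_jobs):
        start = i * n // num_jobs
        end = (i + 1) * n // num_jobs
        p = mp.Process(target=worker, args=(shared, start, end))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    print(np.frombuffer(shared.get_obj(), dtype=np.float32))

if __name__ == '__main__':
    main()
```

Note that plain module-level NumPy arrays are only copied to children (and only on platforms that fork); writes made in a child never propagate back to the parent, which is why an explicitly shared buffer is needed here.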

1 answer

你好瞎i
#2 · 2019-08-20 09:44
import numpy as np
import multiprocessing as mp

ar = np.zeros((5,5))

def callback_function(result):
    x,y,data = result
    ar[x,y] = data

def worker(num):
    data = ar[num,num]+3
    return num, num, data

def apply_async_with_callback():
    pool = mp.Pool(processes=5)
    for i in range(5):
        pool.apply_async(worker, args = (i, ), callback = callback_function)
    pool.close()
    pool.join()
    print "Multiprocessing done!"

if __name__ == '__main__':
    ar = np.ones((5,5)) # rebinds the module-level ar before the pool starts, so this array is used
    apply_async_with_callback()

Explanation: You set up your data array, your worker, and your callback function. The pool's process count determines how many independent workers run, and each worker can handle more than one task. The worker returns its result to the parent process, where the callback writes it back into the array.
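To make that last point concrete, here is a small self-contained sketch (names are illustrative): `apply_async` callbacks execute in the parent process, which is why they can safely mutate parent-side state such as `ar` above.

```python
import multiprocessing as mp

results = []

def collect(value):
    # apply_async callbacks run in the parent process, so appending to
    # this parent-side list is safe (keep callbacks short and fast).
    results.append(value)

def square(x):
    # Runs in a worker process; the return value is shipped back.
    return x * x

if __name__ == '__main__':
    pool = mp.Pool(processes=2)
    for i in range(4):
        pool.apply_async(square, args=(i,), callback=collect)
    pool.close()
    pool.join()
    print(sorted(results))  # [0, 1, 4, 9]
```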

The `if __name__ == '__main__'` guard prevents the lines under it from running again whenever the module is imported, including the re-imports that child processes perform under the spawn start method.
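A minimal illustration of what the guard prevents (the function here is hypothetical):

```python
import multiprocessing as mp

def double(x):
    return 2 * x

# Module-level code runs on every import. Under the 'spawn' start method
# (the default on Windows and macOS), each child re-imports this module,
# so an unguarded Pool at module level would try to spawn children
# recursively and crash.
if __name__ == '__main__':
    with mp.Pool(2) as pool:
        print(pool.map(double, [1, 2, 3]))  # [2, 4, 6]
```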
