Pickle error on code for converting numpy array in

Posted 2019-08-04 13:28

Question:

I am trying to use the code from https://stackoverflow.com/a/15390953/378594 to convert a numpy array into a shared memory array and back. I run the following code:

shared_array = shmarray.ndarray_to_shm(my_numpy_array)

and then pass shared_array as one of the arguments in the argument list for a multiprocessing pool:

pool.map(my_function, list_of_args_arrays)

where list_of_args_arrays contains my shared array and the other arguments.

This results in the following error:

PicklingError: Can't pickle <class 'multiprocessing.sharedctypes.c_double_Array_<array size>'>: attribute lookup multiprocessing.sharedctypes.c_double_Array_<array size> failed

where <array size> is the total number of elements in my numpy array.

I guess something has changed in numpy ctypes or something like that?

Further details:

I only need access to shared information. No editing will be done by the processes.

The function that calls the pool is a method of a class. The class is instantiated, and the method called, from a main.py file.

Answer 1:

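Apparently multiprocessing.Pool pickles every argument it sends to its workers, so passing a multiprocessing.Array through pool.map cannot work: shared ctypes objects are meant to be handed to child processes when those processes are created, not pickled afterwards. Changing the code so that it uses an array of processes instead of a pool did the trick.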
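A minimal sketch of that approach, reading "an array of processes" as a list of multiprocessing.Process objects that receive the shared array through args. The names partial_sum and parallel_sum are hypothetical, and the summing task is only a stand-in for the real read-only work:

```python
import multiprocessing as mp
import numpy as np


def partial_sum(shared, start, stop, results, slot):
    # Wrap the shared buffer as a NumPy view; no data is copied.
    view = np.frombuffer(shared, dtype=np.float64)
    results[slot] = float(view[start:stop].sum())


def parallel_sum(data, n_workers=4):
    # lock=False returns a raw shared ctypes array, which is enough
    # when the workers only read from it.
    shared = mp.Array('d', data.size, lock=False)
    np.frombuffer(shared, dtype=np.float64)[:] = data.ravel()

    # One output slot per worker, also in shared memory.
    results = mp.Array('d', n_workers, lock=False)

    # Split the index range into n_workers contiguous chunks.
    bounds = np.linspace(0, data.size, n_workers + 1).astype(int)
    procs = [
        mp.Process(
            target=partial_sum,
            args=(shared, int(bounds[k]), int(bounds[k + 1]), results, k),
        )
        for k in range(n_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sum(results)


if __name__ == '__main__':
    print(parallel_sum(np.arange(1000, dtype=np.float64)))
```

The shared arrays are given to each Process at construction time, which is the point at which multiprocessing allows them to be transferred, so no PicklingError is raised.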



Answer 2:

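I think you are overcomplicating things. There is no need to pickle the arrays (especially if they are read-only):

You just need to keep them accessible through a global variable:

(Known to work on Linux, where the pool workers are forked and inherit the parent's globals. It is unlikely to work on Windows, where workers are started as fresh interpreters rather than forked.)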

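import multiprocessing as mp
import numpy as np

class si:
    # Holder for the shared, read-only arrays (acts as a global variable)
    arrs = None

def summer(i):
    # Workers read the arrays they inherited from the parent process
    return si.arrs[i].sum()

def main():
    # Fill the global before creating the pool so that forked workers inherit it
    si.arrs = [np.zeros(100) for _ in range(1000)]
    pool = mp.Pool(16)
    res = pool.map(summer, range(1000))
    pool.close()
    pool.join()
    print(res)

if __name__ == '__main__':
    main()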

If your arrays need to be both read and written, see the Stack Overflow question "Is shared readonly data copied to different processes for Python multiprocessing?"
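For the read/write case, one common pattern (a sketch, not necessarily what the linked question recommends) is to pass a multiprocessing.Array to the pool workers through the initializer argument instead of through pool.map. The names _init_worker, square_slot and square_all are hypothetical:

```python
import multiprocessing as mp
import numpy as np

_shared = None  # set in every worker by _init_worker


def _init_worker(shared):
    # Runs once in each worker; the shared array arrives when the
    # worker process is created, so it is never sent through pool.map.
    global _shared
    _shared = shared


def square_slot(i):
    # The lock serializes access while this worker updates the array.
    with _shared.get_lock():
        view = np.frombuffer(_shared.get_obj(), dtype=np.float64)
        view[i] = view[i] ** 2
    return i


def square_all(values, n_workers=4):
    shared = mp.Array('d', len(values))  # lock=True by default
    shared[:] = values
    with mp.Pool(n_workers, initializer=_init_worker,
                 initargs=(shared,)) as pool:
        pool.map(square_slot, range(len(values)))
    # Copy the results out of shared memory as a plain list.
    return shared[:]


if __name__ == '__main__':
    print(square_all([1.0, 2.0, 3.0]))
```

Only the integer indices travel through pool.map here; the writes made by the workers land in the same memory the parent reads back at the end.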