I'm trying to use multiprocessing's Pool.map() function to divide out work simultaneously. When I use the following code, it works fine:
    import multiprocessing

    def f(x):
        return x*x

    def go():
        pool = multiprocessing.Pool(processes=4)
        print pool.map(f, range(10))

    if __name__ == '__main__':
        go()
However, when I use it in a more object-oriented approach, it doesn't work. The error message it gives is:
    PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
This occurs when the following is my main program:
    import someClass

    if __name__ == '__main__':
        sc = someClass.someClass()
        sc.go()
and the following is my someClass class:
    import multiprocessing

    class someClass(object):
        def __init__(self):
            pass

        def f(self, x):
            return x*x

        def go(self):
            pool = multiprocessing.Pool(processes=4)
            print pool.map(self.f, range(10))
Anyone know what the problem could be, or an easy way around it?
The problem is that multiprocessing must pickle things to sling them among processes, and bound methods are not picklable. The workaround (whether you consider it "easy" or not ;-) is to add the infrastructure to your program to allow such methods to be pickled, registering it with the copy_reg standard library module.
For example, Steven Bethard's contribution to this thread (towards the end of the thread) shows one perfectly workable approach to allow method pickling/unpickling via copy_reg.

The solution from parisjohn above works fine for me. Plus, the code looks clean and easy to understand. In my case there are a few functions to call using Pool, so I modified parisjohn's code a bit below. I changed call so that it can invoke several functions; the function names are passed in the argument dict from go():

You could also define a __call__() method inside your someClass(), which calls someClass.go(), and then pass an instance of someClass() to the pool. This object is picklable and it works fine (for me)...

Some limitations, though, to Steven Bethard's solution:
When you register your class method as a function, the destructor of your class is, surprisingly, called every time your method finishes processing. So if you have one instance of your class that calls its method n times, members may disappear between two runs and you may get a message such as

    malloc: *** error for object 0x...: pointer being freed was not allocated

(e.g. for an open member file) or

    pure virtual method called, terminate called without an active exception

(which means that the lifetime of a member object I used was shorter than I thought). I got this when n was greater than the pool size. Here is a short example:

Output:

The __call__ method is not quite equivalent, because [None,...] are read from the results:

So neither method is satisfactory...
Why not use a separate function?
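One way to read that suggestion, as a rough sketch: keep the method on the class, but hand the pool a plain module-level function that receives the instance together with the argument. The helper name call_f and the (instance, x) tuple convention are assumptions made for this example, not something from the question:

```python
import multiprocessing


class SomeClass(object):
    def f(self, x):
        return x * x

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        try:
            # Ship (instance, argument) pairs to a plain function instead of
            # passing the bound method self.f itself.
            return pool.map(call_f, [(self, x) for x in range(10)])
        finally:
            pool.close()
            pool.join()


def call_f(args):
    # Module-level helper (hypothetical name): pickled by reference, so the
    # worker processes only need to be able to import this module.
    obj, x = args
    return obj.f(x)


if __name__ == '__main__':
    print(SomeClass().go())
```

The instance is pickled along with every task, so it still has to be picklable, and any changes a worker makes to it are not visible in the parent process.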